The Fixed Income Data team is responsible for monetizing data generated by Citi's fixed income businesses and building data analytics tools/services that provide actionable insights with direct impact on revenue.
The Lead Data Engineer will be responsible for designing, implementing, and optimizing distributed data processing jobs to handle large-scale data in Hadoop Distributed File System (HDFS) and S3 Storage using Apache Kafka, Flink Java and Flink SQL, Apache Spark and Python. This role requires a deep understanding of data engineering principles, proficiency in Java and Python, and hands-on experience with the Kafka and S3 ecosystems. The Lead Data Engineer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
Responsibilities:
Subject Matter Expert (SME) in Finance and Risk data with experience in data processing jobs to handle large-scale data in Hadoop Distributed File System (HDFS), S3 Storage using Apache Spark, Apache Kafka Streaming, Apache Flink Java, Flink SQL and Python.
Good programming skills in Java, SQL and Python
Distribute data to downstream systems by generating feeds or publishing to Kafka topics
Required to support situations in which end user consultation is required to identify system function specifications and incorporate them into overall system design and delivery. Additionally, utilize comprehensive knowledge of multiple areas within technology to achieve technological objectives.
Communicate risks effectively to business owners so that they can make informed decisions.
Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
Ensure essential procedures are followed and help define operating standards and processes
Serve as advisor or coach to new or lower level analysts
Has the ability to operate with a limited level of direct supervision.
Can exercise independence of judgement and autonomy.
Acts as SME to senior stakeholders and/or other team members.
Qualifications:
8+ years of relevant experience in Hadoop Distributed File System (HDFS) using Apache Spark, Python, Java and SQL
2+ years of relevant experience in S3 Storage using Apache Kafka, Flink Java and Flink SQL for low-latency processing, including monitoring and optimizing the performance of Kafka clusters, troubleshooting and resolving issues related to Kafka and data processing, and implementing best practices for Kafka architecture and operations
Experience in systems analysis and programming of software applications
Experience in managing and implementing successful projects
Working knowledge of consulting/project management techniques/methods
Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
Strong communication skills and attention to detail and accuracy.
Demonstrated leadership skills.
Basic knowledge of industry practices and standards
Consistently demonstrates clear and concise written and verbal communication
Education:
Bachelor's degree/University degree or equivalent experience
Prior financial industry experience will be a plus
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
Additional Responsibilities:
Data Processing and Transformation:
Design and implement big data warehouse applications to process and transform large datasets
Develop ETL pipelines with Apache Kafka, Flink, Spark, and Python for data ingestion, cleaning, aggregation, and transformation.
Data Distribution:
Send data to downstream systems by generating feeds or publishing to Kafka topics
Performance Optimization:
Optimize ETL jobs for efficiency, reducing run time and resource usage.
Fine-tune memory management, caching, and partitioning strategies for optimal performance
Data Engineering with Hadoop, Spark, Kafka, Flink:
Load data from different sources into S3 Storage, ensuring data accuracy and integrity.
Testing and debugging:
Troubleshoot and debug Kafka job failures, and monitor job logs and the Kafka UI Manager to identify issues.
Coding standard adherence:
Identify and address coding vulnerabilities. Enforce the coding standard to eliminate code vulnerabilities.
Adhere to big data best practices, including small-file elimination, Hive SRE scan success, and archival implementation for ideal architecture utilization.
Job Family Group:
Technology
Job Family:
Applications Development
Time Type:
Full time
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi (***/citi/accessibility/application-accessibility.htm).
View Citi's EEO Policy Statement (***/global/eeo-aa-policy) and the Know Your Rights (/eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf) poster.
Citi is an equal opportunity and affirmative action employer.
Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.