EDUCATION

COLUMBIA UNIVERSITY, NY
MASTER OF SCIENCE

DATA SCIENCE
CLASS OF 2018


VIT UNIVERSITY, IN
BACHELOR OF TECHNOLOGY

COMPUTER SCIENCE AND ENGINEERING
CLASS OF 2017

 

WORK EXPERIENCE

 

February 2019 - Present

DATA SCIENTIST 

IBM DATA SCIENCE & AI ELITE, NEW YORK

Team:

International team of Data Science and AI Experts engaging with clients to transform business with AI.

  • Researching reinforcement learning and graphical models.

  • Actively working with clients to promote IBM technology and the adoption of machine learning and AI in their organizations.

September 2018 - December 2018

DATA SCIENTIST - CAPSTONE PROJECT

GOLDMAN SACHS, NEW YORK

  • Capstone project via Columbia University (With Jared Peterson @ Goldman Sachs) 

  • Developing a model to compute similarity indexes for medical institutes from their clinical trial data in the open-source AACT database (text analytics and exploration of alternative data).

June 2018 – December 2018

DATA SCIENCE & MACHINE LEARNING INTERN

IBM, NEW YORK

Researched reinforcement learning and implemented business use cases.

  • Reinforcement learning trading model for long/short trades and portfolio management.

  • Text clustering model using content structuring with ID-LSTM and HS-LSTM, trained with an actor-critic model.

June 2018 – December 2018

RESEARCH ASSISTANT

COLUMBIA UNIVERSITY MEDICAL CENTER, NEW YORK

Working with patients' EHR and time series data on heart rate, blood pressure and health response during operations. The goal is to develop a forecasting model, using a multi-level model, to detect patient seizures. The time series are analyzed using dynamic time warping methods because they vary in length and are sampled at unequal intervals.
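
To illustrate why dynamic time warping suits series of varying lengths, here is a minimal Python sketch of the textbook DTW distance. It is not the project's code: the function name dtw_distance and the sample heart-rate values are made up for illustration, and it compares values only, so it does not model the unequal sampling intervals mentioned above.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Hypothetical helper for illustration; uses absolute difference as the
    local cost and allows match, insertion and deletion steps.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Made-up heart-rate readings: the second series is a stretched copy of the
# first, so the warped alignment cost is zero despite the different lengths.
heart_rate_a = [70, 72, 75, 80, 78]
heart_rate_b = [70, 70, 72, 75, 75, 80, 78]
print(dtw_distance(heart_rate_a, heart_rate_b))
```

Because each point may align with several points of the other series, two recordings of different lengths can still be compared directly, which a point-by-point Euclidean distance cannot do.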

January 2018 - May 2018

RESEARCH ASSISTANT

COLUMBIA UNIVERSITY - SCIKIT LEARN COLLABORATION

Analysis and extension of scikit-learn (Guide: Prof. Andreas Mueller)

Scikit-learn is a Python machine learning library containing a large collection of machine learning models, as well as evaluation metrics and tools for implementing machine learning workflows.

The goal of this project is to analyze the current usage of scikit-learn on a large scale (i.e. the scale of all open-source code, even all public code), to identify usage patterns, problematic use cases and ways to improve the interface, and to extend the library based on the findings.

Sept – Dec 2017

RESEARCH ASSISTANT

COLUMBIA BUSINESS SCHOOL, NEW YORK

Assisting Cheryl Einhorn with her research project, the AREA Method study of the decision-making process. (Confidential project)

May - June 2017

DATA SCIENCE INTERN

ITECH MISSION, INDIA

Worked as a data scientist analysing the UN dataset on the Sustainable Development Goals. The analysis used D3.js, Microsoft Power BI, Tableau and customized MS Excel for complex analytics and visualizations. UN DESA named iTM the winner of a global competition for analysing and visualizing data on the SDG14 commitments announced in the lead-up to, and during, The Ocean Conference, held on 5-9 June 2017 at UN Headquarters, New York.

May - July 2016

MACHINE LEARNING INTERN

MICROSOFT IDC, INDIA 

Used the Bot Framework to build a chatbot in Visual C# for the Microsoft Redmond office, using natural language processing, Language Understanding Intelligent Service (LUIS) and REST APIs.

Jan - Feb 2016

DATA ANALYST INTERN

INTERNATIONAL MANAGEMENT INSTITUTE, INDIA

I worked with Prof. V. K. Seth, who earned his MA and PhD at the Delhi School of Economics and was a professor at the International Management Institute, New Delhi. He was previously a professor at the Faculty of Management Studies, University of Delhi, and has more than four decades of experience in teaching, research and consultancy.

● Did extensive research on capacity utilization of goods and the Index of Industrial Production (IIP).
● Read many research papers and company white papers to build background knowledge.
● Formulated a theoretical hypothesis about IIP growth and tested it through data analysis.
● The analysis included time series methods and the Wharton Index of Growth for capacity utilization of goods in Indian industry, using the Index of Industrial Production.

Dec 2015 - Feb 2016

DATA RESEARCH ANALYST INTERN

NATIONAL INFORMATICS CENTER, INDIA

  • A data set of restaurant customer ratings was collected from internet sources (data.gov.in).

  • The analysis aimed to find the factors that influence the restaurant's overall rating, food rating and service rating.

  • Predictive analysis was applied to estimate customer response in different scenarios.

  • In pre-processing, the dataset was cleaned and qualitative information was converted to quantitative data. The dataset was then split into multiple tables using a database approach based on an entity-relationship diagram.

  • Supervised learning methods of correlation, linear regression and ANOVA were applied. Later, an unsupervised learning method was also applied. Unsupervised learning does not use previously known results to train its models. Rather than predicting a target value, it uses descriptive statistics to examine the natural patterns and relationships that occur within the data.

  • Another data set, covering 8 years of mutual fund performance, was collected.

  • The growth rate of a mutual fund is influenced by various, mostly indirect, factors, such as technological development, industrial growth and gross profit margin in the industries in which the mutual fund company has invested. For a customer to buy a mutual fund bond, the net value of the bond should be high, so that the investor gets higher returns.

  • For this purpose, predictive and time series analysis were applied to generate the model.

Jan - Aug 2015

BUSINESS ANALYST INTERN

EY, INDIA

  • The company hired me for social media and business analytics (highly confidential).

  • The dataset was generated entirely by extracting information from the Twitter profiles of the clients for whom the analytics was performed, using the Twitter Streaming API (text mining).

  • The data was loaded into RStudio using the twitteR library, and the analytics was coded in R.

  • Two main analytical methods were applied. A word cloud is an image composed of the words used in a particular text or subject, in which the size of each word indicates its frequency or importance. Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially to determine whether the writer's attitude towards a particular topic or product is positive, negative or neutral.

  • Later, User Activity Monitoring (UAM) was also performed, using infographics to visualize how active the user was in making, sharing, liking and commenting on posts.

  • All inferences from the data were condensed into one-sliders that presented the information graphically for each client analysed.
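
The word-frequency and sentiment steps described above can be sketched roughly as follows. This is an illustrative Python sketch, not the original R/twitteR code: the lexicons, stop-word list, function names and sample posts are all hypothetical, and a real analysis would rely on a much larger sentiment lexicon or a trained model.

```python
import re
from collections import Counter

# Hypothetical toy lexicons and stop words, for illustration only.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}
STOPWORDS = {"the", "a", "is", "was", "and", "but", "i", "this"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count non-stop-words across posts; a word cloud sizes words by these counts."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in tokenize(text) if w not in STOPWORDS)
    return counts


def sentiment(text):
    """Label a text by comparing positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(w in POSITIVE for w in tokens) - sum(w in NEGATIVE for w in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


posts = ["I love this great product", "Terrible service", "The package arrived"]
print(word_frequencies(posts).most_common(3))
print([sentiment(p) for p in posts])
```

Counting lexicon hits is the simplest form of sentiment scoring; it ignores negation and sarcasm, which is one reason dedicated sentiment packages are usually preferred in practice.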

Jun - July 2015

MARKET DATA ANALYST INTERN

INTERNATIONAL MANAGEMENT INSTITUTE, INDIA

  • A survey was conducted among stock market investors to analyse their investment behavior.

  • The collected data was converted from qualitative to quantitative form for ease of analysis. Data cleaning and pre-processing were performed.

  • The aim of the project was to analyse and build models for customer satisfaction and for profit prediction and enhancement in the stock market.

  • The analysis was done using IBM SPSS.

Mar - May 2015

MARKET DATA ANALYST

TATA CONSULTANCY SERVICES, INDIA 

  • Sensex and Nifty data was collected from the BSE and NSE websites.

  • Values of various parameters, such as gold rates, industrial production, textile rates and mining, were extracted.

  • The purpose was to find how growth or decline in various industries can affect the Sensex and Nifty, causing them to go up or down.

  • Predictive modeling using regression and time series analysis was applied.

  • The analysis was done in IBM SPSS.

Dec 2014 - Jan 2015

RESEARCH INTERN - CLOUD COMPUTING

TATA CONSULTANCY SERVICES, INDIA

  • I worked for the client DirecTV, USA, modeling their transmission via a cloud platform.

  • Extensive research was done to understand encryption algorithms, video streaming, real-time transmission using cloud computing, etc.

  • A cloud architecture was designed to receive data from satellite, perform video encryption and transmit it in real time to the receiver set-top boxes.

Jun - July 2014

RESEARCH & DEVELOPMENT INTERN

DELHI UNIVERSITY, INDIA

  • Extensive research on forensic science and handwriting analysis was done.

  • Handwriting features and their relationship to human personality were studied.

  • A survey was conducted in which uniform handwriting samples were collected from families (father, mother, grandparents, kids and siblings) in order to analyse which characteristics of handwriting are inherited.

  • Later, software was developed in MATLAB that recognizes handwritten text, similar to OCR, and analyses handwriting features such as slant, kurtosis, skewness, margin gap, and inter-word and inter-line spaces.

  • Our research and the results were well appreciated by many researchers, and our work was published in the International Refereed Journal of Engineering and Science.

PATENT & RESEARCH PUBLICATIONS

PATENT

MACHINE TRADING USING REINFORCEMENT LEARNING

PATENT FILED: SEP 21, 2018

 

PROJECTS & RESEARCH EXPERIENCES

GRADUATE RESEARCH ASSISTANT

Working under Prof. Andreas Mueller on the analysis and extension of scikit-learn. The goal of this project is to analyze the current usage of scikit-learn on a large scale (i.e. the scale of all open-source code, even all public code), to identify usage patterns, problematic use cases and ways to improve the interface, and to extend the library based on the findings.

 

SPECIALIZATIONS & INTERESTS

DEEP LEARNING SPECIALIZATION

Andrew Ng

REINFORCEMENT LEARNING

David Silver

DEEP LEARNING BOOK

Ian Goodfellow

DEEP LEARNING USING TENSORFLOW & KERAS

Google's workshop in San Jose

PHD LEVEL COURSE ON ADVANCED ALGORITHMS

Project on "Multi-scale Exploration of Convex Functions and Bandit Convex Optimization"

INTERESTS

Data Science
Artificial intelligence
Machine learning
Algorithm development
Business Analytics
Data Visualization
Data mining/Big data analytics
Cloud computing

 

©2018 by The Peculiar Unicorn