CV
Education
- Ph.D. in Applied Mathematics, Statistics, and Machine Learning, Paris, 2019 - Current
- Main advisors: Wojciech Pieczynski and Emmanuel Monfrini
- Title: Generative Probabilistic Models: Discriminative classifiers and Neural Networks
- Thesis at the Hadamard Doctoral School of Mathematics, shared by Paris-Saclay University (ranked #1 in Mathematics in the Shanghai Ranking) and Institut Polytechnique de Paris (ranked #31), at the SAMOVAR lab of Telecom SudParis
- Funding: 3-year grant from IBM
- M.Sc. in Data Science and Machine Learning, ENSAE Paris, 2016 - 2018
- ENSAE Paris is considered one of the top 3 engineering schools for Statistics in France
- M.Sc. in Statistics, Telecom SudParis, 2014 - 2018
- Preparatory School MPSI-PSI*, Lycée Thiers Marseille, 2012 - 2014
Experience
- Ph.D. Student, IBM and Telecom SudParis, 2019 - Current
- Research on the ability of any generative probabilistic model to define a discriminative classifier, parametrized with neural networks, with application to Natural Language Processing. This thesis enables the development of neural models based on Probabilistic Graphical Models that combine strong modeling potential, good performance, and ease of serving.
- Some research projects:
- Developing extensions of the classifier induced by Naive Bayes for Text Classification and Sentiment Analysis, dividing the error by 4.5 while keeping the same complexity.
- Developing new algorithms for Hidden Markov Chains that can take complex features into account, achieving relevant results for Part-Of-Speech Tagging and Named-Entity Recognition and state-of-the-art results for Chunking, while remaining easy to serve.
- Showing the equivalence between the linear-chain Conditional Random Field and the Hidden Markov Chain.
- Tech: Python, PyTorch, Transformers, Flair
- Data Scientist Consultant, IBM France, November 2018 - April 2019
- Data scientist developing a long-tail chatbot using state-of-the-art Question Answering models (BERT) in French, coupled with a document retriever module (based on Elasticsearch)
- Other projects: clustering of unlabeled data, mail classification, data analysis
- Tech: Python, TensorFlow, Keras, Jupyter, Flask, Scikit-learn, GoogleCP, Elasticsearch
- Data Scientist Intern, IBM France, May 2018 - November 2018
- Development of a database enrichment model relying on user questions, based on the sequential neural model BiLSTM, then BERT, query analysis, and the word embedding methods GloVe and FastText.
- Others: research intern during summer 2016 and summer 2017 at Telecom SudParis - Paris, and developer intern during summer 2015 at Bo Digitize Ltd - Tel Aviv
Publications
- E. Azeraf, E. Monfrini, and W. Pieczynski, “Improving usual Naive Bayes classifier performances with Neural Naive Bayes based models,” accepted at International Conference on Pattern Recognition Applications and Methods (ICPRAM), 2022
- E. Azeraf, E. Monfrini, and W. Pieczynski, “On equivalence between linear-chain conditional random fields and hidden Markov chains,” accepted at 14th International Conference on Agents and Artificial Intelligence (ICAART), 2022
- E. Azeraf, E. Monfrini, and W. Pieczynski, “Using the Naive Bayes as a discriminative model,” 13th International Conference on Machine Learning and Computing (ICMLC 2021), Association for Computing Machinery, 2021, pp. 106–110
- E. Azeraf, E. Monfrini, E. Vignon, and W. Pieczynski, “Highly Fast Text Segmentation With Pairwise Markov Chains,” 2020 6th IEEE Congress on Information Science and Technology (CiSt), IEEE, 2020, pp. 361–366
- E. Azeraf, E. Monfrini, E. Vignon, and W. Pieczynski, “Introducing the Hidden Neural Markov Chain Framework,” 13th International Conference on Agents and Artificial Intelligence (ICAART), 2021, pp. 1013–1020
Submitted
- E. Azeraf, E. Monfrini, E. Vignon, and W. Pieczynski, “Hidden Markov Chains, Entropic Forward-Backward, and Part-Of-Speech Tagging,” submitted to AISTATS 2022
- E. Azeraf, E. Monfrini, and W. Pieczynski, “Deriving discriminative classifiers from generative models,” submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence
Awards
- Best Ph.D. student 2020 award of Telecom SudParis, winning an iPhone 11
- Best internship 2018 award of ENSAE Paris, winning a reward of 1000€
Miscellaneous
- Languages: English (fluent), French (native)
- Teaching: Head Teacher of Probabilistic Models and Machine Learning, M2 course at Telecom Paris, and Assistant Teacher for many courses at ENSAE Paris, ENSTA Paris, and Telecom SudParis (L3, M1 and M2 levels), always highly appreciated by students. All these schools are among the top engineering schools in France.
- Interests: Passionate about comics, especially DC and Marvel; The Dark Knight Returns by F. Miller, Spider-Man Blue by J. Loeb and T. Sale, and Justice League: Forever Evil are among my favorite works.
- Cloud certifications: Preparing for Azure cloud Machine Learning certifications
- References:
- Emmanuel Monfrini, Full Professor at Telecom SudParis
- Emmanuel Vignon, ex-Watson Practice Leader at IBM France, CEO of Sicara
- Vincent Giraud, Senior Managing Consultant at IBM France