Hi! I'm Fatemeh

I am currently working as an applied research scientist at Imagia.

    Download Resume

About Me

Who Am I?

Hi, I'm Fatemeh. I am an applied research scientist at Imagia, working mostly on NLP projects.

I have experience applying Transformer-based pre-trained language models such as BERT, RoBERTa, and ELECTRA to downstream Natural Language Processing (NLP) tasks (Classification, Regression, Summarization, Named Entity Recognition, and Information Retrieval).

I have been in the Computer Science field for more than seven years. For the first five years I concentrated mainly on software development, and then shifted my focus to data science projects. I am a confident, motivated researcher and a team player. I have listed a selected set of my projects on my personal website.

I have experience with both general-domain and biomedical text. My work focuses on transfer learning, NLP, and deep learning. I am passionate about solving real-world problems with state-of-the-art research in the field.

What Do I Do?

Here are some of my overall skills

Research

I have experience researching, designing, and developing sophisticated ML solutions for real-world problems, using both traditional and cutting-edge methods.

Data

Excellent knowledge of data collection, exploration, cleaning, integration, and manipulation, as well as model evaluation and generalization.

Skills

Knowing that technology evolves fast has made me passionate about learning new concepts and skills. I am able to implement novel ideas.

My Specialty

My Technical Skills

My skills are divided into three sections: Professional, Intermediate, and Familiar. The skills in the Professional section are the tools that I'm working with regularly.

Professional

Programming Languages: Python
Python Libraries: PyTorch, Keras, MLflow, Hugging Face Transformers, spaCy, NLTK, NumPy, scikit-learn, pandas, Matplotlib, Jupyter notebooks
Databases (DBMS): MySQL, SQL
Version Control: Git, GitHub, GitLab
Operating Systems: Windows, Linux
Writing Tools: Microsoft Office, LaTeX
Workflow: Agile Development & Scrum
Others: Slack, Trello

Intermediate

Programming Languages: Java, C, C++, PHP (CodeIgniter, Laravel), HTML, CSS, JavaScript (Bootstrap), MASM Assembly
Python Libraries: pytest, py2neo, Selenium, PyMongo
Cloud: Docker
Programming Platforms: Android
Databases (DBMS): PostgreSQL, SQL Server (familiar)
Databases (NoSQL): Neo4j, MongoDB
Graphic Design: Adobe Fireworks, Adobe Photoshop, Camtasia
Others: Jibble

Familiar

Programming Languages: MATLAB, R
Ontology: SPARQL, Protégé
HDL: Verilog

Education

M.S. Computer Science May 2019 – April 2021

Dalhousie University, Halifax, Canada.
GPA: 4.07 (out of 4.3), 12 credits
Thesis title: "MTLV: A Library for Building multi-task learning Architectures"
Courses:

  • Advanced topics in NLP: A+
  • Deep Learning: A
  • Machine Learning: A-
  • Visual Analytics: A+

B.S. Software Engineering Sep 2013 – Feb 2018

Shiraz University, Shiraz, Iran.
GPA: 16.11 (out of 20), 141 credits
Senior Project: Kitchen safety (IoT project)

Experience

Work Experience

Applied Research Scientist - NLP May 2021 – Present

Imagia, Montreal, Canada (Full-time)

  • Designing CLI tools for using Transformer-based models on real-world data.
  • Working with biomedical domain corpora.

Applied Research Scientist - NLP Nov 2020 – Apr 2021

Imagia, Montreal, Canada (Part-time)

  • Designing CLI tools for using Transformer-based models on real-world data.
  • Working with biomedical domain corpora.

NLP Research Assistant Mar 2019 – Apr 2021

Dalhousie University, Halifax, Canada

  • Used deep language models to solve downstream NLP tasks.
  • Worked on both individual and team projects.
  • Leveraged my knowledge of Python to develop a CLI tool for multi-task learning.

Data Analyst Aug 2018 – Feb 2019

Ayten Company, Shiraz, Iran
This company focuses mostly on software development and R&D projects.
Achievements and work experience:

  • Member of the social network analysis team; worked mostly with Python, Neo4j, and MongoDB (through the py2neo and PyMongo libraries).
  • Worked within the Scrum software development framework.

Linux Instructor Jun 2015 – Aug 2016

Joyandegane Parto Nor Sadegh Company, Shiraz, Iran
This company focuses mostly on programming boards (Raspberry Pi) and developing IoT projects.

  • Started as an intern and was then selected to lead the training sessions.
  • Topics included: UNIX vs. Linux, file manipulation in the terminal, kernel description, file system, shell scripting, system administration and network basics, and version control.

Experience

Teaching Experience

Databases Lab Assistant Feb 2017 – May 2017

Shiraz University, Shiraz, Iran

  • Facilitated tutorial classes.
  • Presented lectures on PostgreSQL.
  • Held office hours to help students with Psycopg, SQLAlchemy and Django.

My Work

Projects

Query-focused Extractive Summarization using pre-trained models January 2020 - April 2020

  • Course Project for Natural Language Processing (NLP) Course
  • Team Project (collaboration with a PhD student)
  • Used different scoring functions to extract the most important sentences of a document
  • Download Project Report
Abstract: In the process of writing a research paper, researchers often spend a lot of time organizing and summarizing previous work related to the research. To help with this problem, the proposal of this project is to use Query-focused Extractive Summarization algorithms to produce relevant highlights in related research. For this project, the problem of insufficient labeled data was solved by using pre-trained models such as BERT and BioBERT to produce accurate representations of words. To measure the validity of the approaches, they were applied to the BioASQ dataset of medical articles and obtained results consistent with each other using Cosine similarity and Euclidean distance, each with several pre-trained models. One of the challenges of this project was that producing the embeddings of a lot of sentences with a pre-trained model is a very time-consuming task, so a scalable tool was developed to efficiently compute token embeddings for a variety of pre-trained models.
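As a rough illustration of the scoring step mentioned in the abstract, the sketch below ranks sentence vectors against a query vector using cosine similarity or Euclidean distance. It is not the project's code: the function names, the `top_k_sentences` helper, and the tiny hand-written vectors are assumptions made for the example, and in the project the vectors would come from a pre-trained model such as BERT or BioBERT.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_sentences(query_vec, sentence_vecs, k=2, metric="cosine"):
    """Return the indices of the k sentences closest to the query (hypothetical helper)."""
    indices = range(len(sentence_vecs))
    if metric == "cosine":
        # Higher similarity means more relevant, so sort in descending order.
        key = lambda i: cosine_similarity(query_vec, sentence_vecs[i])
        return sorted(indices, key=key, reverse=True)[:k]
    if metric == "euclidean":
        # Smaller distance means more relevant, so sort in ascending order.
        key = lambda i: euclidean_distance(query_vec, sentence_vecs[i])
        return sorted(indices, key=key)[:k]
    raise ValueError(f"unknown metric: {metric}")


# Toy 3-dimensional "embeddings", made up for illustration only.
query = [1.0, 0.0, 1.0]
sentences = [
    [0.9, 0.1, 1.1],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.8],
    [-1.0, 0.5, -1.0],
]

cosine_top = top_k_sentences(query, sentences, k=2, metric="cosine")
euclidean_top = top_k_sentences(query, sentences, k=2, metric="euclidean")
```

On this toy data both metrics select the same two sentences, although they order them differently.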

Classification Of Imbalanced Dataset Using BERT Embeddings May 2019 - Aug 2019

  • Course Project for Deep Learning Course
  • Team Project (collaboration with 2 MCS students)
  • Used BERT embeddings to classify the type of harassment in each tweet of an imbalanced Twitter dataset.
  • Download Project Report
Abstract: Online harassment is becoming prevalent as a specific type of communication on Twitter. Considering the huge amount of user-generated tweets each day, the problem of detecting and possibly limiting these contents automatically in real-time is becoming a fundamental problem. But real-world datasets are often imbalanced, consisting predominantly of “normal” examples with fewer “abnormal” ones, which causes the learning algorithm to simply generate a trivial classifier that assigns every example to the majority class. To tackle this problem, we use SMOTE to oversample the embeddings of the minority classes, where the embeddings are obtained from the BERT pretrained language model. Finally, we use these oversampled embeddings to train our bi-directional LSTM classifier model to categorize the tweets into four classes: non-harassment, sexual harassment, physical harassment and indirect harassment. Our experiments show that using SMOTE on the top-layer representations of BERT improves the F1 score significantly more than merely adjusting the class weights.
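The core idea of SMOTE, as used in the abstract above, is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration of that idea only. It is not the project's code or the imbalanced-learn implementation; `smote_oversample`, its parameters, and the 2-dimensional toy points standing in for BERT embeddings are assumptions for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class vectors.

    Simplified SMOTE-style interpolation (hypothetical helper); needs at
    least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class "embeddings", made up for illustration only.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority_points, n_new=6, k=2, seed=42)
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region the minority class already occupies.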

Visual Analysis Of Harassment Classification In Twitter May 2019 - Aug 2019

  • Course Project for Visual Analytics Course
  • Team Project (Collaboration with 2 MCS students)
  • Developed a visualization system using D3 to demonstrate the semantic similarity of words.
  • Download Project Report
  • Although we used the same dataset for this project and the Deep Learning course project, the methodologies and purposes of the two projects are completely different.
Abstract: In our project, we develop a deep neural network model to classify tweets into four classes: non-harassment, sexual harassment, physical harassment and indirect harassment. We then use Deep SHAP, a unified approach that explains the output of any machine learning model, to interpret the predictions of our classifier. We demonstrate the semantic similarity of words in the tweet by visualizing the embeddings in low dimensions using t-SNE. To provide quick, high-level information about the similarities and anomalies between the categories, the final predictions of the model are summarized into different styles of treemaps.

Kitchen safety IoT project Dec 2017

  • Senior project for my undergraduate degree at Shiraz University
  • Web server: Laravel framework
  • Website: CodeIgniter framework, Bootstrap
  • Download Project Report
  • GitHub
Description: This project assists people who work in large kitchens, such as those of grand hotels, where temperature and heat are critical. Such kitchens have giant refrigerators, freezers, a large cooking area, and a couple of storage areas to look after. To make sure that each section stays in appropriate condition and the equipment functions correctly, the project monitors the environment to prevent any harm to the workers. A temperature and humidity sensor is placed in the kitchen. Its readings are sent through a Wi-Fi module to a web server, which stores the data. Users can then view this information on the project website. The website also lets the user set thresholds that the kitchen's readings should not rise above or fall below. If a reading falls outside the given thresholds, the user is immediately notified by email. The history of the kitchen's temperature is also available to view at any time on the website.
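The threshold check described above can be sketched in a few lines. This is an illustrative Python sketch, not the project's code (the project itself used Laravel and CodeIgniter): the function names, the `(timestamp, value)` reading format, and the sample values are assumptions, and the email notification step is left out.

```python
def out_of_range(value, low, high):
    """True when a sensor reading is below the lower or above the upper threshold."""
    return value < low or value > high


def readings_to_report(readings, low, high):
    """Return the (timestamp, value) pairs that should trigger a notification."""
    return [(ts, value) for ts, value in readings if out_of_range(value, low, high)]


# Made-up temperature history in degrees Celsius, for illustration only.
history = [("08:00", 4.5), ("08:10", 5.0), ("08:20", 9.2), ("08:30", 1.5)]
alerts = readings_to_report(history, low=2.0, high=8.0)
```

Here `alerts` holds the two readings that fall outside the 2.0 to 8.0 range; a real system would pass each of them to its notification step.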

Client and server project Jul 2017

  • Course project for Microprocessor Lab
  • Introduced to serial programming and to sending packets from client to server and vice versa.
  • Developed using Python (server) and C (client).

File manipulation simulation Jul 2015

  • Course project for Assembly.
  • Designed to work with system calls.
  • Developed using assembly 8086 - 32 bit.

CMD simulator Jan 2013

  • Course project for Fundamentals of Computer Programming.
  • Got familiar with advanced programming skills in Python.
  • Developed a simulation of the commands of CMD.

Presentations

Poster presentation Nov 2019

Rahimi F., Milios E., Matwin S. (2019, Nov). "Biomedical sentence-level and document-level representation learning". Poster session presented at the Canadian computing conference for Women in Technology, Mississauga, ON.

Awards

Awarded support to attend the virtual Grace Hopper Celebration (vGHC) 2020 Sep 2020

The Culture of Respect Committee in Computer Science (CoReCS) of the Dalhousie Faculty of Computer Science awarded me support to attend the virtual Grace Hopper Celebration (vGHC) 2020.

2nd place in Diversity and Inclusion Hackathon Feb 2020

Organizer: ShiftKey Labs and the Faculty of Computer Science (Dalhousie University)
Our team won second place for presenting Real Align, software that provides a safe engagement platform focused on peer-to-peer support, where people with disabilities can connect with each other.

Awarded support to attend CAN-CWiC 2019 Nov 2019

The Culture of Respect Committee in Computer Science (CoReCS) of the Dalhousie Faculty of Computer Science awarded me support to both attend and present a poster at the ACM Canadian Celebration of Women in Computing (CAN-CWiC).

Parya Scholarship Nov 2019

Parya Organization.
I won the Parya Scholarship, which is awarded to Iranians studying at Canadian post-secondary institutions.

Read

Recent Blog

Coming soon...

Get in Touch

Contact

Halifax, NS, Canada