Building Virtual Assistants with Rasa 2022-07
Rasa is an open-source framework, grounded in state-of-the-art NLU research, for building conversational virtual assistants.
Keywords on NLU research: Intent Classifiers, Entity Extractors, Dialogue Policies
Open-source tools: Rasa Open Source, Rasa Action Server, Rasa X
Computer Vision Models Life-Cycle Management 2021-12
Easy and efficient workflows for computer vision projects (e.g. image classification, object detection, segmentation) via open-source tools: CVAT for annotation, OpenMMLab for modeling, and FiftyOne for curating data and improving computer vision datasets and models.
Open-source tools: FiftyOne (Voxel51), CVAT, OpenMMLab
Scene Text Recognition 2021-06
A technical report on research in Scene Text Detection and Recognition. Papers covered in the slides:
- Scene Text Recognition:
    - CRNN (CNN + BiLSTM + CTC loss)
    - ASTER (Spatial Transformer Network + CNN + BiLSTM + Attention-Based Decoder)
    - MORAN (Pixel-Level Rectification + CNN + BiLSTM + Attention-Based Decoder)
    - TextScanner (Segmentation-Based with Mutual-Supervision Mechanism)
- Scene Text Detection:
    - DB-Net: Differentiable Binarization (a segmentation-based detection method)
- OCR System:
    - PP-OCR: A Practical Ultra Lightweight OCR System from Baidu Inc. Main idea: 1 Detector + 1 Text Box Rectifier + N Recognizers (for recognizing multiple languages).
Keywords: OCR, Scene Text Detection, Scene Text Recognition.
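A recognizer trained with CTC loss, such as CRNN, emits one class distribution per time step, and the text is usually recovered by taking the best class at each step, merging consecutive repeats, and dropping the blank symbol. The sketch below illustrates that greedy decoding step only. The function name, the toy alphabet, and the use of index 0 for the blank are assumptions made for illustration, not details taken from the slides.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy (best-path) CTC decoding.

    frame_scores: one score vector per time step; index 0 is the blank and
                  index i >= 1 maps to alphabet[i - 1].
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path: List[int] = [
        max(range(len(scores)), key=lambda i: scores[i]) for scores in frame_scores
    ]

    # 2. Merge consecutive repeats and 3. drop blanks.
    decoded = []
    previous = None
    for idx in best_path:
        if idx != previous and idx != BLANK:
            decoded.append(alphabet[idx - 1])
        previous = idx
    return "".join(decoded)


if __name__ == "__main__":
    # Toy example with classes [blank, 'a', 'b'].
    scores = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a again (repeat, merged with the previous step)
        [0.9, 0.05, 0.05],  # blank (separates the next 'a' from the first)
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.2, 0.7],    # b
    ]
    print(ctc_greedy_decode(scores, "ab"))  # prints "aab"
```

The blank between the two runs of 'a' is what lets CTC output a doubled character. Without it, the repeats would collapse into a single 'a'.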
An Easy ML flow 2020-02
An easy end-to-end ML flow that includes loading data, processing data, tuning the hyper-parameters of a selected algorithm, comparing the performance of different models, and applying the best model to get prediction results.
Optimization tools: scikit-optimize
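The five steps of such a flow can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the project's code: the data is synthetic, the selected algorithm is a closed-form ridge regression, a plain grid search over the penalty stands in for a dedicated optimization tool, and all function names are hypothetical. It depends only on NumPy.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def predict(model, X):
    w, b = model
    return X @ w + b


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def run_flow(seed=0, alphas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    rng = np.random.default_rng(seed)

    # 1. Load data (synthetic stand-in for a real dataset).
    X = rng.normal(size=(300, 5))
    y = X @ np.array([2.0, -1.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=300)

    # 2. Process data: split into train/validation/new data, then standardize
    #    using statistics from the training split only.
    X_train, X_val, X_new = X[:200], X[200:250], X[250:]
    y_train, y_val = y[:200], y[200:250]
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    X_train, X_val, X_new = [(A - mu) / sigma for A in (X_train, X_val, X_new)]

    # 3. Tune the hyper-parameter (ridge penalty) with a grid search
    #    scored on the validation split.
    grid = {a: mse(y_val, predict(fit_ridge(X_train, y_train, a), X_val)) for a in alphas}
    best_alpha = min(grid, key=grid.get)
    ridge_model = fit_ridge(X_train, y_train, best_alpha)

    # 4. Compare models: tuned ridge vs. a baseline that always predicts
    #    the training mean.
    candidates = {
        "baseline": lambda A: np.full(len(A), y_train.mean()),
        "ridge": lambda A: predict(ridge_model, A),
    }
    val_scores = {name: mse(y_val, f(X_val)) for name, f in candidates.items()}
    best_model = min(val_scores, key=val_scores.get)

    # 5. Apply the best model to unseen data to get predictions.
    predictions = candidates[best_model](X_new)
    return {
        "best_alpha": best_alpha,
        "val_scores": val_scores,
        "best_model": best_model,
        "predictions": predictions,
    }


if __name__ == "__main__":
    result = run_flow()
    print(result["best_alpha"], result["best_model"], result["val_scores"])
```

Swapping the grid search in step 3 for a smarter search strategy changes only that step. The rest of the flow stays the same.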
Geospatial Data Challenge 2018-09
A data science challenge related to handling geospatial data with shapefiles.
Geospatial data tools: GeoPandas, Shapely, Rasterio (for raster data).
NYC Green Taxi Data Challenge 2018-06
A data science challenge related to Green Taxi in New York City.
Shiny Applications 2018-05
Shiny applications developed in 2017-2018 during a research assistantship with Prof. Voeten at Georgetown University.
Face recognition with Eigenfaces 2017-11
Use PCA to reduce the dimensionality of 400 face images (each with 10304 pixels) and find the top k eigenfaces. After compressing the images by projecting them onto these eigenfaces, use Euclidean distance to find the face image in the database that is most similar to a given face image.
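The procedure can be sketched with NumPy as follows. This is an illustrative sketch rather than the project's implementation: the function names are hypothetical, PCA is computed through an SVD of the mean-centered image matrix, and the demo uses random arrays of the stated shape (400 x 10304) in place of real face images.

```python
import numpy as np


def fit_eigenfaces(faces, k):
    """faces: array of shape (n_images, n_pixels).

    Returns the mean face and the top-k eigenfaces as rows, shape (k, n_pixels).
    """
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are the principal directions (eigenfaces), ordered by
    # decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:k]


def project(faces, mean_face, eigenfaces):
    """Compress images to k coefficients each (dimension reduction)."""
    return (faces - mean_face) @ eigenfaces.T


def most_similar(query, database_codes, mean_face, eigenfaces):
    """Index of the database face closest to `query` in eigenface space."""
    query_code = project(query[None, :], mean_face, eigenfaces)[0]
    distances = np.linalg.norm(database_codes - query_code, axis=1)  # Euclidean
    return int(np.argmin(distances))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    faces = rng.random((400, 10304))  # random stand-in for 400 face images
    mean_face, eigenfaces = fit_eigenfaces(faces, k=20)
    codes = project(faces, mean_face, eigenfaces)  # shape (400, 20)

    # Query with a slightly perturbed copy of one database image.
    query = faces[42] + 0.01 * rng.standard_normal(10304)
    print(most_similar(query, codes, mean_face, eigenfaces))
```

Comparing the k-dimensional codes instead of raw pixels is what makes the search cheap: each distance is computed over k numbers rather than 10304.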