Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
A Practical Guide to Qdrant for RAG Applications
Published:
A practical overview of Qdrant for RAG applications, covering why a vector database is needed in RAG…
Python try / except / else / finally learning notes
Published:
A practical overview of Python try / except / else / finally learning notes, covering core concepts, basic structure, and why except should first catch the specific exception types it can handle.
LLM/RAG Learning notes
Published:
A practical overview of LLM/RAG Learning notes, covering main content and a summary report on the LangChain main line.
Boruta Feature Selection
Published:
A practical overview of Boruta Feature Selection, covering Purpose, How It Works, Advantages.
Kernel Density Estimation
Published:
A practical overview of Kernel Density Estimation, covering Kernel Density Estimation, Nonparametric Estimation, The univariate case (1 variable).
Fully Convolutional Networks
Published:
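A summary of FCN (Fully Convolutional Networks): Fully Convolutional Networks for Semantic Segmentation (GitHub). Traditional convolutional neural networks (CNNs) usually contain convolutional layers and fully connected layers. In this paper, the authors recognize that fully connected layers produce fixed-size output feature vectors, which limits CNNs in…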
Convolutional Neural Networks
Published:
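What is a convolutional neural network (CNN), and what are its applications in computer vision? A convolutional neural network (CNN) is a deep learning model that is particularly suited to processing grid-structured data such as images and video. The core idea of a CNN is to extract features from the input data through convolution and pooling operations, and to use those features for tasks such as classification, recognition, or regression. Basic principles ref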
Happy 31st birthday
Published:
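Today I am not talking about machine learning or organizing knowledge I cannot remember; instead, at thirty-one, I want to record some recent reflections. Why have I had time for so much reflection lately? Because this January, less than a month before the Lunar New Year and one week before my probation period was due to end, I was suddenly told that I would not be made permanent. By now I have been unemployed for almost exactly two months, and hiring was at a complete standstill over the Lunar New Year. Then, before my 31st birthday, I had at least one fairly decent verbal offer, and…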
Clustering algorithms
Published:
A practical overview of Clustering algorithms, covering Improvements for gradient descent, Momentum (preserving inertia), AdaGrad (environment-aware).
Activation Functions
Published:
A practical overview of Activation Functions, covering Sigmoid Family, Hard Sigmoid, Swish.
Loss Functions
Published:
A practical overview of Loss Functions, covering Regression Loss Functions, L1 Loss (Mean Absolute Error, MAE), L2 Loss (Mean Square Error, MSE).
Classification Algorithms
Published:
A practical overview of Classification Algorithms, covering Assumption, Logistic Regression, Description.
Time-series Forecasting
Published:
A practical overview of Time-series Forecasting, covering ARIMA (Autoregressive Integrated Moving Average), Description, Strengths.
Metrics to Evaluate Predictive Models
Published:
A practical overview of Metrics to Evaluate Predictive Models, covering MAE (Mean Absolute Error), Definition, Formula.
Donut: an OCR-free Document Understanding Transformer
Published:
A practical overview of Donut: an OCR-free Document Understanding Transformer, covering Encoder, Decoder, Output Conversion.
Object Detection Algorithms
Published:
A practical overview of Object Detection Algorithms, covering Important notes, Loss Function, R-CNN.
portfolio
MNIST Handwritten Classification
Machine Learning course project
Echo Cancellation
Machine Learning course project: solve the adaptive echo cancellation problem so that the noise is removed from music
Python-based CyTOF processing and analyzing package
Machine Learning course project: solve the adaptive echo cancellation problem so that the noise is removed from music
Time-series forecasting
Time-series forecasting - A winning contribution to the Siemens’ Tech For Sustainability 2023 campaign in the Swarm Behaviour on the Grid track
Advancing Plant Phenotyping with PlantCV: An Open-Source Image Analysis Software Package
February 2020 ~ September 2021
This project was centered around PlantCV (Plant Computer Vision), an open-source software package meticulously designed for plant phenotyping analysis. Developed in Python and integrating advanced image processing libraries such as OpenCV (Open Source Computer Vision Library), PlantCV aims to provide plant scientists and researchers with a powerful, flexible, and user-friendly tool for automating and quantifying the extraction of plant phenotypic information from various image data.
Enhanced Detection and Classification of Cell Nuclei in H&E Stained Pathology Images Using Mask R-CNN
October 2021 ~ December 2021
This project focused on the application of Mask R-CNN, a state-of-the-art model for instance segmentation tasks, to detect and classify common types of cell nuclei in H&E (Hematoxylin and Eosin) stained pathology images of Non-Small Cell Lung Cancer (NSCLC) and Breast Cancer. By leveraging transfer learning and customizing the loss function of Mask R-CNN, the project aimed to address the challenges posed by incomplete labeling in the dataset and improve the model’s performance in both detection and classification tasks.
Automated Prescription Parsing API development, maintenance, and improvement (for the Japanese Market)
September 2023 – October 2023
This project aimed to extend the capabilities of an existing in-house developed tool, OptiReader, designed for the automatic parsing of eyeglass prescriptions. Initially supporting the North American market, the project’s goal was to adapt the tool for the Japanese market, particularly for VR eyeglasses, by incorporating innovative document understanding technologies and custom solutions to handle unique prescription formats prevalent in Japan.
Time-series forecasting for a payment processing company
May 2023
This was a pre-interview project for a data scientist position. It gave me the chance to perform data analysis, time-series modeling, and forecasting.
publications
Root identification in minirhizotron imagery with multiple instance learning
Published in Machine Vision and Applications, 2020
This paper is about applying multiple instance learning to an image segmentation task (root segmentation) on minirhizotron images.
Recommended citation: "Yu, G., Zare, A., Sheng, H., Matamala, R., Reyes-Cabrera, J., Fritschi, F.B. and Juenger, T.E., 2020. Root identification in minirhizotron imagery with multiple instance learning. Machine Vision and Applications, 31, pp.1-13." /files/1903.03207.pdf
The plant response to high CO2 levels is heritable and orchestrated by DNA methylation
Published in New Phytologist, 2023
This paper is about how the plant response to high CO2 levels is heritable and orchestrated by DNA methylation.
Recommended citation: Panda K, Mohanasundaram B, Gutierrez J, McLain L, Castillo SE, Sheng H, Casto A, Gratacós G, Chakrabarti A, Fahlgren N, Pandey S. The plant response to high CO2 levels is heritable and orchestrated by DNA methylation. New Phytologist. 2023 Jun;238(6):2427-39. https://nph.onlinelibrary.wiley.com/doi/full/10.1111/nph.18876
A Deep Learning Approach for Histology-Based Nucleus Segmentation and Tumor Microenvironment Characterization
Published in Modern Pathology, 2023
This paper is about applying a deep learning-based approach to nucleus segmentation and tumor microenvironment characterization.
Recommended citation: Rong R, Sheng H, Jin KW, Wu F, Luo D, Wen Z, Tang C, Yang DM, Jia L, Amgad M, Cooper LA. A Deep Learning Approach for Histology-Based Nucleus Segmentation and Tumor Microenvironment Characterization. Modern Pathology. 2023 Aug 1;36(8):100196. https://www.sciencedirect.com/science/article/pii/S0893395223001011
Increasing the Throughput of Annotation Tasks Across Scales of Plant Phenotyping Experiments
Published in NAPPN2024, 2023
This paper is about increasing the throughput of annotation tasks across scales of plant phenotyping experiments.
Recommended citation: Sheng H, Gutierrez J, Schuhl H, Murphy KM, Acosta-Gamboa L, Gehan M, Fahlgren N. Increasing the Throughput of Annotation Tasks Across Scales of Plant Phenotyping Experiments. Authorea Preprints. 2023 Oct 19. https://www.techrxiv.org/doi/full/10.22541/essoar.169773045.57471797
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
