Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
Boruta Feature Selection
Published:
Purpose
Boruta is designed to determine which variables (features) are significant in predicting the outcome with the given dataset. It is particularly useful when dealing with high-dimensional data.
Kernel Density Estimation
Published:
It has been a while since I last took some learning notes - I have been buried in work and trying to figure out a balance, and wasting my time …
Financial Credit Risk Management
Published:
Model dimensions of the credit business
Fully Convolutional Networks
Published:
A summary of FCN (Fully Convolutional Networks)
Convolutional Neural Networks
Published:
What is a convolutional neural network (CNN)? What are its applications in computer vision?
Basic pandas
Published:
Python, data science, pandas
Basic numpy
Published:
Python, data science, numpy
Happy 31st birthday
Published:
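Today I am not going to talk about machine learning or organize knowledge I can never remember. Instead, at thirty-one, I want to record some of my recent thoughts.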
Clustering algorithms
Published:
gradient descent, stochastic gradient descent, batch gradient descent
Activation Functions
Published:
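An activation function maps a neuron's input to its output: it converts the sum of the input signals into the output signal. Most activation functions are nonlinear, which is what makes the output of a multilayer perceptron nonlinear, so that a neural network can approximate any nonlinear function and be applied to a wide range of nonlinear models.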
Loss Functions
Published:
Regression Loss Functions
Classification Algorithms
Published:
Classification Algorithms
Time-series Forecasting
Published:
ARIMA (Autoregressive Integrated Moving Average)
ARIMA is a popular statistical model for forecasting time series data. It is especially well suited to short- to medium-term forecasting of data with trends, seasonality (through its seasonal extension, SARIMA), or cyclic patterns. The model aims to describe the autocorrelations in the data.
Clustering algorithms
Published:
The importance of clustering are of
Metrics to Evaluate Predictive Models
Published:
Metrics used to evaluate predictive models, commonly applied in regression and time-series forecasting.
Donut: an OCR-free Document Understanding Transformer
Published:
My notes on this paper: OCR-free Document Understanding Transformer (github repo).
Object Detection Algorithms
Published:
A summary of object detection algorithms
portfolio
MNIST Handwritten Classification
Published:
Machine Learning course project
Echo Cancellation
Published:
Machine Learning course project: solving the adaptive echo cancellation problem so that the noise is removed from music
Switchgrass Genotype Classification using Hyperspectral Imagery
Published:
Master Thesis
Python-based CyTOF processing and analyzing package
Published:
Machine Learning course project: solving the adaptive echo cancellation problem so that the noise is removed from music
Time-series forecasting
Published:
Time-series forecasting - A winning contribution to the Siemens’ Tech For Sustainability 2023 campaign in the Swarm Behaviour on the Grid track
Advancing Plant Phenotyping with PlantCV: An Open-Source Image Analysis Software Package
Published:
February 2020 ~ September 2021
This project was centered around PlantCV (Plant Computer Vision), an open-source software package meticulously designed for plant phenotyping analysis. Developed in Python and integrating advanced image processing libraries such as OpenCV (Open Source Computer Vision Library), PlantCV aims to provide plant scientists and researchers with a powerful, flexible, and user-friendly tool for automating and quantifying the extraction of plant phenotypic information from various image data.
Enhanced Detection and Classification of Cell Nuclei in H&E Stained Pathology Images Using Mask R-CNN
Published:
October 2021 ~ December 2021
This project focused on the application of Mask R-CNN, a state-of-the-art model for instance segmentation tasks, to detect and classify common types of cell nuclei in H&E (Hematoxylin and Eosin) stained pathology images of Non-Small Cell Lung Cancer (NSCLC) and Breast Cancer. By leveraging transfer learning and customizing the loss function of Mask R-CNN, the project aimed to address the challenges posed by incomplete labeling in the dataset and improve the model’s performance in both detection and classification tasks.
Automated Prescription Parsing API development, maintenance, and improvement (for the Japanese Market)
Published:
September 2023 – October 2023
This project aimed to extend the capabilities of an existing in-house developed tool, OptiReader, designed for the automatic parsing of eyeglass prescriptions. Initially supporting the North American market, the project’s goal was to adapt the tool for the Japanese market, particularly for VR eyeglasses, by incorporating innovative document understanding technologies and custom solutions to handle unique prescription formats prevalent in Japan.
Time-series forecasting for a payment processing company
Published:
May 2023
This was a pre-interview project for a data scientist position. It gave me a chance to perform data analysis and time-series modeling and forecasting.
publications
Root identification in minirhizotron imagery with multiple instance learning
Published in Machine Vision and Applications, 2020
This paper is about applying multiple instance learning to an image segmentation task (root segmentation) in minirhizotron images.
Recommended citation: "Yu, G., Zare, A., Sheng, H., Matamala, R., Reyes-Cabrera, J., Fritschi, F.B. and Juenger, T.E., 2020. Root identification in minirhizotron imagery with multiple instance learning. Machine Vision and Applications, 31, pp.1-13." /files/1903.03207.pdf
The plant response to high CO2 levels is heritable and orchestrated by DNA methylation
Published in New Phytologist, 2023
This paper is about the heritability of the plant response to high CO2 levels and the role of DNA methylation in orchestrating it.
Recommended citation: Panda K, Mohanasundaram B, Gutierrez J, McLain L, Castillo SE, Sheng H, Casto A, Gratacós G, Chakrabarti A, Fahlgren N, Pandey S. The plant response to high CO2 levels is heritable and orchestrated by DNA methylation. New Phytologist. 2023 Jun;238(6):2427-39. https://nph.onlinelibrary.wiley.com/doi/full/10.1111/nph.18876
A Deep Learning Approach for Histology-Based Nuclei Segmentation and Tumor Microenvironment Characterization
Published in Modern Pathology, 2023
This paper is about applying a deep learning-based approach to nuclei segmentation and tumor microenvironment characterization.
Recommended citation: Rong R, Sheng H, Jin KW, Wu F, Luo D, Wen Z, Tang C, Yang DM, Jia L, Amgad M, Cooper LA. A Deep Learning Approach for Histology-Based Nucleus Segmentation and Tumor Microenvironment Characterization. Modern Pathology. 2023 Aug 1;36(8):100196. https://www.sciencedirect.com/science/article/pii/S0893395223001011
Increasing the Throughput of Annotation Tasks Across Scales of Plant Phenotyping Experiments
Published in NAPPN2024, 2023
This paper is about increasing the throughput of annotation tasks across scales of plant phenotyping experiments.
Recommended citation: Sheng H, Gutierrez J, Schuhl H, Murphy KM, Acosta-Gamboa L, Gehan M, Fahlgren N. Increasing the Throughput of Annotation Tasks Across Scales of Plant Phenotyping Experiments. Authorea Preprints. 2023 Oct 19. https://www.techrxiv.org/doi/full/10.22541/essoar.169773045.57471797
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.