Yibo Zhang

PhD in computational chemistry, University of New Hampshire

Currently

Interested in materials science and machine learning

Specialized in

Machine Learning, Deep Learning, Scientific Computing, Materials Informatics, Data Engineering, Cloud Computing, Full Stack Development

Professional Experience

Northeast Materials Database (NEMAD) Project
Jan. 2024 - Present
  • Led development of a scalable database with 26,000+ entries using ETL pipelines and automated data collection
  • Architected and deployed machine learning pipeline achieving 90% accuracy in multi-class classification
  • Developed ensemble models for regression tasks achieving R² of 0.86 and reducing prediction error by 40%
  • Implemented feature engineering techniques resulting in 62 high-value material discoveries
  • Built and deployed production-ready REST API and web interface (www.nemad.org) with 150+ monthly active users
Python, Vue.js, FastAPI, SQL, AWS, Docker
GPTArticleExtractor: Large-Scale Data Mining Project
Jan. 2023 - Jan. 2024
  • Designed and implemented an NLP pipeline leveraging LLMs to extract structured data from 22,000+ scientific papers
  • Integrated YOLOv8 model for automated segmentation of figures and text regions in research papers
  • Achieved 83% accuracy in automated information extraction, reducing manual processing time by 90%
  • Engineered vector embedding system for semantic search, improving information retrieval accuracy by 45%
  • Developed custom data validation framework ensuring 95%+ data quality
  • Created scalable ETL pipeline processing 10,000+ papers per hour
  • Built interactive dashboard for real-time data visualization and analysis
Python, LangChain, OpenAI API, Computer Vision, Prompt Engineering, Vector DB, PostgreSQL
Deep Learning for Vector Field Reconstruction
Jan. 2022 - Jan. 2023
  • Developed custom U-Net architecture with ResNet backbone for complex image-to-vector field mapping
  • Designed multi-scale CNN architecture incorporating skip connections and residual blocks
  • Implemented coordinate transformation pipeline converting polar to Cartesian coordinates for 3D reconstruction
  • Designed and optimized bilinear interpolation algorithms for accurate vector field projection
  • Built data processing system handling 10TB+ of simulation data using distributed computing
  • Achieved 92% accuracy in vector field prediction using custom loss functions
  • Reduced computation time by 75% through GPU optimization and parallel processing
  • Created automated testing framework for model validation
PyTorch, CUDA, NumPy, OpenCV, SciPy, Linux, CNNs, ResNet
Job Comparison Android Application
Jan. 2023 - May 2023
  • Architected and developed Android application for comparing job offers using Java and Android Studio
  • Implemented systematic design approach using UML diagrams for initial architecture planning
  • Designed and built intuitive user interface following Material Design principles
  • Established Git workflow with feature branching strategy for team collaboration
  • Created comprehensive testing suite including unit and integration tests
  • Led weekly team meetings and coordinated development efforts through GitHub
  • Managed feature implementation through agile development process
Java, Android Studio, Git, UML, Material Design, JUnit
Large-Scale Computational Analysis Platform
Nov. 2018 - Oct. 2020
  • Built automated workflow for high-throughput computational chemistry calculations
  • Developed Python scripts for data preprocessing and feature extraction
  • Created interactive visualization dashboard for real-time analysis
  • Implemented version control and documentation system for reproducible research
  • Reduced analysis time by 80% through process automation
Python, Linux, Git, Jupyter, Matplotlib
Brazilian E-Commerce Analysis Project
Aug. 2023 - Nov. 2023
  • Conducted comprehensive analysis of Olist dataset focusing on customer satisfaction and purchasing patterns
  • Developed predictive models using regression analysis and sentiment analysis techniques
  • Created interactive dashboards and visualizations using Tableau, D3.js, and Matplotlib
  • Performed geographic distribution analysis to identify regional sales patterns and opportunities
  • Engineered recommendation system algorithm improving customer product discovery
  • Implemented data processing pipeline handling 100,000+ transaction records
  • Built interactive web-based visualization platform for real-time data exploration
Python (pandas), SQLite, Tableau, D3.js, JavaScript, Machine Learning

Work Experience

Research Assistant
Aug. 2019 - Present
University of New Hampshire
Teaching Assistant
Aug. 2018 - Aug. 2019
University of New Hampshire
Research Assistant
Oct. 2015 - Jan. 2017
Stony Brook University

Technical Skills

Machine Learning/AI:
TensorFlow, PyTorch, Scikit-learn, Deep Learning, Neural Networks, GANs, LangChain, Vector Databases, OpenAI, Ollama
Data Processing/Analysis:
Python (NumPy, pandas), SQL, PySpark, Big Data Analytics, ETL Pipelines, Databricks, Apache Airflow
Cloud Computing:
AWS, GCP, Docker, Kubernetes, Terraform
Data Visualization:
D3.js, Matplotlib, Tableau, Plotly, Streamlit
Development Tools:
Git, Linux, Ubuntu, NixOS, Bash, CI/CD
Network & Security:
SSH, SSL/TLS, Network Protocols
Programming Languages:
Python, SQL, JavaScript, REST APIs, FastAPI
Database Systems:
PostgreSQL, MongoDB, Redis
Computer Vision & Image Processing:
OpenCV, Image reconstruction, Coordinate transformations, Bilinear interpolation, 3D visualization

Education

Georgia Institute of Technology, Online

Computer Science - Master's Degree

Jan. 2022-Present

University of New Hampshire, Durham

Computational Chemistry - PhD

Sep. 2018-2024

Stony Brook University, New York

Materials Science and Engineering - Master's Degree

Aug. 2015-May 2017

Zhengzhou University, China

Materials Chemistry, College of Materials Sciences and Engineering - Bachelor's Degree

Sep. 2011-Jul. 2015

Publications

Large Language Model-Driven Database for Thermoelectric Materials

Itani, S., Zhang, Y., & Zang, J.

arXiv preprint arXiv:2501.00564 (2024)

Northeast Materials Database (NEMAD): Enabling Discovery of High Transition Temperature Magnetic Compounds

Itani, S., Zhang, Y., & Zang, J.

arXiv preprint arXiv:2409.15675 (2024)

GPTArticleExtractor: An automated workflow for magnetic material database construction

Zhang, Y., Itani, S., Khanal, K., Okyere, E., Smith, G., Takahashi, K., & Zang, J.

Journal of Magnetism and Magnetic Materials, 597, 172001 (2024)

Three-dimensional magnetization reconstruction from electron optical phase images with physical constraints

Lyu, B., Zhao, S., Zhang, Y., Wang, W., Zheng, F., Dunin-Borkowski, R. E., Zang, J., & Du, H.

Science China Physics, Mechanics & Astronomy, 67(11), 1-11 (2024)

MagNet: machine learning enhanced three-dimensional magnetic reconstruction

Lyu, B., Zhao, S., Zhang, Y., Wang, W., Du, H., & Zang, J.

arXiv preprint arXiv:2210.03066 (2022)

Conferences

2025 Joint MMM-Intermag Conference, New Orleans

January 13-17, 2025

Poster presentation: "Comprehensive Database of Magnetic Materials Using AI-Driven Methodologies"

IEEE AtC-AtG Magnetics Conference 2024

October 2, 2024

Oral presentation: "Comprehensive Database of Magnetic Materials Using AI-Driven Methodologies"

2022 Fall Meeting of the New England Section (NES) of the APS

October 14, 2022

Poster presentation: "MagNet: machine learning enhanced three-dimensional magnetic reconstruction"
