data mining code


Install
pip install datamining==0.2.2

Documentation

Data mining software for supply chain management

This project follows an online client-server model for supply chain management.

About

This project is Zhe Liu's bachelor's thesis at Zhejiang University.
Any use without prior notice is not allowed.

How to deploy this application

Deploy locally with docker:

  1. Clone this repo, switch to docker branch
  2. Install docker
  3. Run docker build -t data-mining .
  4. Run docker-compose up -d --build
  5. Go to localhost:8000, all done

Configure Python environment locally:

  1. Clone this repo, switch to master branch
  2. Install python3 first
  3. Install all dependencies in requirement.txt
  4. Run python manage.py runserver
  5. Go to localhost:8000, all done

Deploy on server:

  1. Clone this repo, switch to server branch
  2. Configure nginx and run uwsgi:
     uwsgi --http :8000 --chdir /root/dataMining/ -w djangoData.wsgi
  3. Configure path and allowed host

     -STATIC_URL = '/static/'
     +STATIC_URL = '/polls/static/'
     +STATIC_ROOT = '/root/dataMining/polls/static/'

  4. Follow the same steps as in "Configure Python environment locally"
  5. Go to 'ServerIP:8000', all done
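
The path and allowed-host changes from the "Configure path and allowed host" step might look like the fragment below in the Django settings module. This is only a sketch: the STATIC_URL and STATIC_ROOT values are taken from the diff above, while the ALLOWED_HOSTS entries are placeholder assumptions, not values from this project.

```python
# settings.py (fragment) -- sketch only, adjust to your own server

# Hosts Django will serve requests for. "ServerIP" is a placeholder
# (an assumption): replace it with the server's real IP or domain name.
ALLOWED_HOSTS = ["ServerIP", "localhost", "127.0.0.1"]

# URL prefix and filesystem location of static files,
# using the values from the diff above.
STATIC_URL = "/polls/static/"
STATIC_ROOT = "/root/dataMining/polls/static/"
```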

Road Map

Version

  • 0.1 CSV preview and nav bars, writing templates for home page and all sub-pages
  • 0.2 Classification template and base logic set up
  • 0.3 Finished Classification
  • 0.4 Finished documentation templates and documents for Classification
  • 0.5 Clustering template and base logic set up
  • 0.6 Finished Clustering
  • 0.7 Finished documentation for Clustering
  • 0.8 Deploy this software to server
  • 0.9 Finished Apriori-based association rules, finished upload and download functionalities
  • 1.0 Adding detailed documentation and all functionalities for parameter adjustment

Functionalities

General

  • Show all data uploaded
  • File upload and download
  • Using Ajax to dynamically change HTML elements
  • Using Django templates for all types of demand
  • Configure Django URL config for different functionalities
  • Using OOP for Django views
  • More data format support: xls
  • More data format support: text file
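
As an illustration of showing uploaded data, a CSV preview could be sketched with the standard library alone. The function name `preview_csv`, its `max_rows` parameter, and the sample rows are all hypothetical. They are not taken from this project's code.

```python
import csv
import io


def preview_csv(text, max_rows=5):
    """Return (header, rows) for the first `max_rows` data rows of CSV text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])  # empty header if the text is empty
    rows = []
    for row in reader:
        if len(rows) >= max_rows:
            break
        rows.append(row)
    return header, rows


# Made-up sample data, for illustration only.
sample = "sku,warehouse,quantity\nA100,Hangzhou,12\nB200,Ningbo,7\nC300,Hangzhou,30\n"
header, rows = preview_csv(sample, max_rows=2)
print(header)
print(rows)
```

In a Django view, the returned header and rows could be passed to a template as context to render an HTML table.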

Data preprocess

  • Missing data handling
  • More advanced missing data handling (fix missing data automatically)
  • Char to digit transformation
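
The two preprocessing ideas above can be sketched in plain Python. This is a minimal illustration under assumptions, not the project's implementation: mean imputation is just one possible way to fill missing data automatically, and the helper names `fill_missing_with_mean` and `encode_labels` are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def encode_labels(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)
        encoded.append(codes[v])
    return encoded, codes


# Made-up columns, for illustration only.
quantities = [10, None, 30, 20]
filled = fill_missing_with_mean(quantities)

regions = ["east", "west", "east", "north"]
encoded, mapping = encode_labels(regions)
print(filled)
print(encoded, mapping)
```

Returning the mapping alongside the encoded column makes it possible to translate numeric results back to the original strings later.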

Clustering

  • Clustering: KMeans
  • Clustering: Mini Batch KMeans
  • Clustering: Affinity Propagation
  • Clustering: Mean Shift
  • Clustering: Spectral Clustering
  • Clustering: Agglomerative Clustering
  • Clustering: DBSCAN
  • Clustering: Birch
  • Documentation for Clustering
  • Parameters Adjustment for Clustering
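
To give a feel for what the KMeans entry does and what "parameters adjustment" can mean, here is a small library-free sketch of Lloyd's algorithm. It is not the project's implementation. The function name `kmeans`, the naive seeding from the first k points, and the sample points are assumptions made for illustration. The adjustable parameters here are `k` and `iterations`.

```python
import math


def kmeans(points, k, iterations=10):
    """Tiny Lloyd's-algorithm k-means; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Made-up 2D points forming two obvious groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Changing `k` changes how many groups are produced, which is the kind of setting a parameter-adjustment form would expose.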

Classification

  • Classification: Logistic Regression
  • Classification: KNeighbors Classifier
  • Classification: SVC
  • Classification: GradientBoosting Classifier
  • Classification: DecisionTree Classifier
  • Classification: Random Forest Classifier
  • Classification: MLP Classifier
  • Classification: Gaussian Naive Bayes
  • Documentation for classification
  • Parameters Adjustment for Classification
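
As one example from the list above, the idea behind a k-nearest-neighbors classifier can be sketched without any library. This is an illustration under assumptions, not the project's code: the function name `knn_predict`, the labels, and the training points are hypothetical. The adjustable parameter is `k`, the number of neighbors that vote.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated groups.
train_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
train_labels = ["low", "low", "low", "high", "high", "high"]

near_low = knn_predict(train_points, train_labels, (2.0, 1.5))
near_high = knn_predict(train_points, train_labels, (8.0, 9.0))
print(near_low, near_high)
```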

Association rules

  • Apriori algorithm
  • Parameters for Apriori algorithm
  • Full documentation for Apriori algorithm
  • More association rule algorithms
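
The frequent-itemset stage of the Apriori algorithm can be sketched in a few lines of plain Python. This is a simplified illustration, not the project's implementation: the function name `apriori`, the `min_support` parameter name, and the transactions are assumptions. It stops at frequent itemsets. Deriving rules with a confidence threshold would be a further step.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {
        frozenset([i]) for i in items if support(frozenset([i])) >= min_support
    }
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        size += 1
        # Join step: union pairs of frequent itemsets into larger candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Made-up transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(transactions, min_support=0.5)
for itemset, value in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

Raising `min_support` shrinks the result, which is the main parameter a user would tune for this algorithm.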