
Update of project documentation

Jérôme BUISINE, 10 months ago, commit eac3bc12d9

+ 56 - 16
README.md

@@ -23,32 +23,76 @@ Generate all needed data for each feature (which requires the whole dataset
 python generate/generate_all_data.py --feature all
 ```
 
-## How to use
+## Project structure
 
-### Multiple directories and scripts are available:
+### Link to your dataset
+
+You have to create a symbolic link to your own dataset, which must respect this structure:
+
+- dataset/
+  - Scene1/
+    - zone00/
+    - ...
+    - zone15/
+      - seuilExpe (file which contains the human-perceived threshold samples for this zone)
+    - Scene1_00050.png
+    - Scene1_00070.png
+    - ...
+    - Scene1_01180.png
+    - Scene1_01200.png
+  - Scene2/
+    - ...
+  - ...
+
+Create your symbolic link:
+
+```
+ln -s /path/to/your/data dataset
+```
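
Equivalently, the link can be created from Python; a minimal sketch, where the source path is illustrative and must be replaced by your own:

```python
import os

# Path to your own dataset (illustrative value, replace with yours)
source = "/path/to/your/data"

# Create the symbolic link named `dataset` if it does not already exist
if not os.path.islink("dataset"):
    os.symlink(source, "dataset")
```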
+
+### Code architecture description
 
-- **dataset/\***: all scene files information (zones of each scene, SVD descriptor files information and so on...).
-- **train_model.py**: script which is used to run specific model available.
-- **data/\***: folder which will contain all *.train* & *.test* files in order to train model.
-- **saved_models/*.joblib**: all scikit learn models saved.
-- **models_info/***: all markdown files generated to get quick information about model performance and prediction. This folder contains also **model_comparisons.csv** obtained after running runAll_maxwell.sh script.
 - **modules/\***: contains all modules useful for the whole project (such as configuration variables)
+- **analysis/\***: contains all Jupyter notebooks used for analysis during the thesis
+- **generate/\***: contains Python scripts to generate data from scenes (described later)
+- **data_processing/\***: all Python scripts to generate custom datasets for models
+- **prediction/\***: all Python scripts to predict new thresholds from computed models
+- **simulation/\***: contains all bash scripts used to run simulations from models
+- **display/\***: contains all Python scripts used to display scene information (such as singular values...)
+- **run/\***: bash scripts to run several steps at once:
+  - generate custom dataset
+  - train model
+  - keep model performance
+  - run simulation (if necessary)
+- **others/\***: folder which contains other scripts, such as a script for getting the performance of a model on a specific scene and writing it into a Markdown file.
+- **data_attributes.py**: file which contains the implementation of all features extracted from an image.
+- **custom_config.py**: overrides the main project configuration of `modules/config/global_config.py`
+- **train_model.py**: script used to train a specific available model.
+
+### Generated data directories
+
+- **data/\***: folder which will contain all generated *.train* & *.test* files used to train models.
+- **saved_models/\***: all saved scikit-learn or Keras models.
+- **models_info/\***: all markdown files generated to get quick information about model performance and prediction, obtained after running the runAll_maxwell.sh script.
+- **results/**: folder which contains the `model_comparisons.csv` file used to store model performance.
+
 
+## How to use
 
 **Remark**: Note that all Python scripts have a *--help* option.
 
 ```
 python generate_data_model.py --help
-
-python generate_data_model.py --output xxxx --interval 0,20  --kind svdne --scenes "A, B, D" --zones "0, 1, 2" --percent 0.7 --sep: --rowindex 1 --custom custom_min_max_filename
 ```
 
 Parameters explained:
-- **output**: filename of data (which will be split into two parts, *.train* and *.test* relative to your choices).
+- **feature**: feature to use
+- **output**: filename of the data (which will be split into two parts, *.train* and *.test*, according to your choices). It needs to be placed in the `data` folder.
 - **interval**: the interval of data you want to use from the SVD vector.
 - **kind**: kind of data ['svd', 'svdn', 'svdne']: not normalized, normalized per vector only, or normalized together.
 - **scenes**: scenes choice for training dataset.
 - **zones**: zones to take for training dataset.
+- **step**: specify the step between pictures used (whether all pictures are taken or only some of them).
 - **percent**: percentage of each zone's data to take (chosen randomly).
 - **custom**: specify if you want your data normalized using the interval and not the whole singular values vector. If so, the value of this parameter is the output filename which will store the min and max values found. This file will be useful later to make predictions with the model (optional parameter).
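
For illustration, the **interval** parameter selects a slice of the SVD vector. A minimal sketch of how such an interval could be applied (the helper name below is hypothetical, not the script's actual code):

```python
def apply_interval(svd_vector, interval):
    """Keep only the SVD vector components inside the given interval string."""
    begin, end = map(int, interval.split(','))
    return svd_vector[begin:end]

# Example: keep the first 20 singular values, as with `--interval 0,20`
vector = list(range(200))
print(len(apply_interval(vector, "0,20")))  # → 20
```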
 
@@ -78,8 +122,8 @@ The model will return only 0 or 1:
 - 0 means the image does not seem to be noisy.
 
 All SVD features developed need:
-- Name added into *feature_choices_labels* global array variable of **modules/utils/config.py** file.
-- A specification of how you compute the feature into *get_svd_data* method of **modules/utils/data_type.py** file.
+- Their name added to the *feature_choices_labels* global array variable of the `custom_config.py` file.
+- A specification of how to compute the feature in the *get_image_features* method of the `data_attributes.py` file.
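
As a sketch, registering a hypothetical feature could look like this (the list entries and the feature computation below are illustrative, not the project's actual values):

```python
# In custom_config.py: declare the new feature name
# (existing entries here are illustrative)
feature_choices_labels = ['lab', 'mscn', 'my_new_feature']

# In data_attributes.py: compute it inside get_image_features
def get_image_features(data_type, block):
    """Return the feature vector computed for the given image block."""
    if data_type == 'my_new_feature':
        # hypothetical computation: mean of each row of the block
        return [sum(row) / len(row) for row in block]
    raise ValueError('unknown feature: %s' % data_type)

print(get_image_features('my_new_feature', [[1, 2, 3], [4, 5, 6]]))  # → [2.0, 5.0]
```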
 
 ### Predict scene using model
 
@@ -102,10 +146,6 @@ All scripts named **prediction/predict_seuil_expe\*.py** are used to simulate mo
 
 Once your simulations are done, check out your **threshold_map/%MODEL_NAME%/simulation\_curves\_zones\_\*/** folder and use it with the help of the **display_simulation_curves.py** script.
 
-### Others...
-
-All others bash scripts are used to combine and run multiple model combinations...
-
 ## License
 
 [The MIT license](https://github.com/prise-3d/Thesis-NoiseDetection-26-attributes/blob/master/LICENSE)

+ 1 - 1
data_attributes.py

@@ -23,7 +23,7 @@ import custom_config as cfg
 from modules.utils import data as dt
 
 
-def get_svd_data(data_type, block):
+def get_image_features(data_type, block):
     """
     Method which returns the data type expected
     """

+ 1 - 1
display/display_simulation_curves.py

@@ -4,7 +4,7 @@ import pandas as pd
 import matplotlib.pyplot as plt
 import os, sys, argparse
 
-from modules.utils.data import get_svd_data
+from modules.utils.data import get_image_features
 
 from modules.utils import config as cfg
 

+ 2 - 2
generate/generate_all_data.py

@@ -16,7 +16,7 @@ sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
 from modules.utils import data as dt
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 
 # getting configuration information
@@ -99,7 +99,7 @@ def generate_data_svd(data_type, mode):
                 # feature computation part #
                 ###########################
 
-                data = get_svd_data(data_type, block)
+                data = get_image_features(data_type, block)
 
                 ##################
                 # Data mode part #

+ 1 - 1
generate/generate_data_model.py

@@ -14,7 +14,7 @@ sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
 from modules.utils import data as dt
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 
 # getting configuration information

+ 1 - 1
generate/generate_data_model_random.py

@@ -14,7 +14,7 @@ sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
 from modules.utils import data as dt
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 
 # getting configuration information

+ 1 - 1
generate/generate_data_model_random_center.py

@@ -14,7 +14,7 @@ sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
 from modules.utils import data as dt
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 
 # getting configuration information

+ 1 - 1
generate/generate_data_model_random_split.py

@@ -14,7 +14,7 @@ sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
 from modules.utils import data as dt
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 
 # getting configuration information

+ 5 - 5
others/testModelByScene.sh

@@ -39,12 +39,12 @@ INPUT_BEGIN=$1
 INPUT_END=$2
 INPUT_MODEL=$3
 INPUT_MODE=$4
-INPUT_feature=$5
+INPUT_FEATURE=$5
 
 zones="0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15"
 
 echo "**Model :** ${INPUT_MODEL}"
-echo "**feature :** ${INPUT_feature}"
+echo "**feature :** ${INPUT_FEATURE}"
 echo "**Mode :** ${INPUT_MODE}"
 echo "**Vector range :** [${INPUT_BEGIN}, ${INPUT_END}]"
 echo ""
@@ -53,10 +53,10 @@ echo "---|--------|-------|----------"
 
 for scene in {"A","B","C","D","E","F","G","H","I"}; do
 
-  FILENAME="data/data_${INPUT_MODE}_${INPUT_feature}_B${INPUT_BEGIN}_E${INPUT_END}_scene${scene}"
+  FILENAME="data/data_${INPUT_MODE}_${INPUT_FEATURE}_B${INPUT_BEGIN}_E${INPUT_END}_scene${scene}"
 
-  python generate/generate_data_model.py --output ${FILENAME} --interval "${INPUT_BEGIN},${INPUT_END}" --kind ${INPUT_MODE} --feature ${INPUT_feature} --scenes "${scene}" --zones "${zones}" --percent 1 --sep ";" --rowindex "0"
+  python generate/generate_data_model.py --output ${FILENAME} --interval "${INPUT_BEGIN},${INPUT_END}" --kind ${INPUT_MODE} --feature ${INPUT_FEATURE} --scenes "${scene}" --zones "${zones}" --percent 1 --sep ";" --rowindex "0"
 
-  python prediction/prediction_scene.py --data "$FILENAME.train" --model ${INPUT_MODEL} --output "${INPUT_MODEL}_Scene${scene}_mode_${INPUT_MODE}_feature_${INPUT_feature}.prediction" --scene ${scene}
+  python prediction/prediction_scene.py --data "$FILENAME.train" --model ${INPUT_MODEL} --output "${INPUT_MODEL}_Scene${scene}_mode_${INPUT_MODE}_feature_${INPUT_FEATURE}.prediction" --scene ${scene}
 
 done

+ 2 - 2
prediction/predict_noisy_image_svd.py

@@ -14,7 +14,7 @@ from PIL import Image
 sys.path.insert(0, '') # trick to enable import of main folder module
 
 import custom_config as cfg
-from data_attributes import get_svd_data
+from data_attributes import get_image_features
 
 # variables and parameters
 path                  = cfg.dataset_path
@@ -79,7 +79,7 @@ def main():
     # load image
     img = Image.open(p_img_file)
 
-    data = get_svd_data(p_feature, img)
+    data = get_image_features(p_feature, img)
 
     # get interval values
     begin, end = p_interval

+ 1 - 1
run/runAll_maxwell.sh

@@ -8,7 +8,7 @@ erased=$1
 if [ "${erased}" == "Y" ]; then
     echo "Previous data file erased..."
     rm ${file_path}
-    mkdir -p models_info
+    mkdir -p results
     touch ${file_path}
 
     # add of header

+ 1 - 1
run/runAll_maxwell_custom.sh

@@ -8,7 +8,7 @@ erased=$1
 if [ "${erased}" == "Y" ]; then
     echo "Previous data file erased..."
     rm ${file_path}
-    mkdir -p models_info
+    mkdir -p results
     touch ${file_path}
 
     # add of header

+ 1 - 1
run/runAll_maxwell_custom_center.sh

@@ -8,7 +8,7 @@ erased=$1
 if [ "${erased}" == "Y" ]; then
     echo "Previous data file erased..."
     rm ${file_path}
-    mkdir -p models_info
+    mkdir -p results
     touch ${file_path}
 
     # add of header

+ 1 - 1
run/runAll_maxwell_custom_split.sh

@@ -8,7 +8,7 @@ erased=$1
 if [ "${erased}" == "Y" ]; then
     echo "Previous data file erased..."
     rm ${file_path}
-    mkdir -p models_info
+    mkdir -p results
     touch ${file_path}
 
     # add of header