Demo for Hydra
About Hydra
Hydra is a Python tool to manage complex configurations in your data science projects.
How to Run the Project
- Clone this repository:
git clone https://github.com/khuyentran1401/hydra_demo.git
- Install Poetry
- Set up the environment:
make setup
Introduction to Hydra
Folders
Folders shown in the video:
Short Summary
Imagine your YAML configuration file looks like this:
process:
  keep_columns:
    - Income
    - Recency
    - NumWebVisitsMonth
    - Complain
    - age
    - total_purchases
    - enrollment_years
    - family_size
  remove_outliers_threshold:
    age: 90
    Income: 600000
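To access the list under process.keep_columns in the configuration file, simply add the @hydra.main decorator to the function that uses the configuration. Here config_path is the directory containing the configuration files, relative to the Python file, and config_name is the file name without the .yaml extension: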
import hydra
from omegaconf import DictConfig


# Hydra loads ../config/main.yaml and passes it to the function as `config`.
@hydra.main(config_path="../config", config_name="main")
def process_data(config: DictConfig):
    print(config.process.keep_columns)


if __name__ == "__main__":
    process_data()
Output:
['Income', 'Recency', 'NumWebVisitsMonth', 'Complain', 'age', 'total_purchases', 'enrollment_years', 'family_size']
Group Configuration Files
TODO