This is a bot to help collect data for any machine learning project.
It was developed using the python-telegram-bot library.
Usage & steps
- Download the repo and install python-telegram-bot:
    git clone https://github.com/diesilveira/labeller_img_python_telegram_BOT.git
    cd labeller_img_python_telegram_BOT
    pip install python-telegram-bot --upgrade
- Create a configuration file named conf.py in the same folder as main.py, then copy and paste the following text into it:
    TOKEN: str = 'YOUR TOKEN'
    # Like D:/Descargas/cleanAndDirtyImages
    PATH_FOLDER: str = 'YOUR PATH'
    LOCAL = 'false'
    BUTTONS = ["BUTTON1", "BUTTON2", "BUTTON3", "BUTTON4"]
    QUESTION = 'QUESTION TO THE PEOPLE ABOUT THE IMAGE?'
    CHOSE = 'Chose: '
    GREETING = ' Welcome and thanks for your help!'
In TOKEN, paste the token of your bot, which must have been created beforehand.
You can see how at: How do I create a bot? – telegram, or follow these steps:
To create a bot with Telegram and get your TOKEN:
- Send /newbot to BotFather from your Telegram account.
- Set the name, the short name and, optionally, the description.
- BotFather sends you your TOKEN.
In PATH_FOLDER, put the path of the folder that contains your image set (used when running locally with images stored on your PC).
If you want to use images from the web, set LOCAL = 'false' and create a file named "url_images.txt" in the same folder as main.py containing, for each image, its name and its URL separated by ";". Important: the URL must be a direct link to the image file!
In BUTTONS, put the names of the buttons, that is, the labels offered for each image sent. You can add as many buttons as you want.
In QUESTION, put the question that is sent along with each image.
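To make the url_images.txt format described above concrete, here is a minimal sketch of how such a file could be parsed. It assumes one "name;url" entry per line and that blank lines can be ignored. The function name `parse_url_images` and the example.com URLs are hypothetical and are not taken from the bot's code.

```python
from typing import List, Tuple


def parse_url_images(text: str) -> List[Tuple[str, str]]:
    """Parse 'name;url' lines into (name, url) pairs, skipping blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ';' only, in case the URL itself contains one.
        name, sep, url = line.partition(";")
        if not sep or not name.strip() or not url.strip():
            raise ValueError(f"expected 'name;url', got: {line!r}")
        entries.append((name.strip(), url.strip()))
    return entries


# Hypothetical file content (assumed layout: one entry per line).
sample = (
    "bin_001.jpg;https://example.com/images/bin_001.jpg\n"
    "\n"
    "bin_002.jpg;https://example.com/images/bin_002.jpg\n"
)
entries = parse_url_images(sample)
```

Rejecting malformed lines early, rather than skipping them silently, makes a wrong separator or a missing URL visible before the bot starts sending images.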
- Finally, run the project and voilà, the bot is running.
The names of the images are saved with their respective labels in the log.txt file, and the images that have already been labeled are recorded in the finished file so that they are not labeled twice.
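The bookkeeping described above could look roughly like the sketch below. The on-disk formats are assumptions, since this README does not specify them: the log is assumed to hold one "image;label" line per answer, and the finished file one image name per line. The file name finished.txt and all function names are hypothetical.

```python
import os
import tempfile
from typing import Iterable, List, Set


def load_finished(path: str) -> Set[str]:
    """Return the set of image names already labeled (empty if no file yet)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def record_label(log_path: str, finished_path: str, image: str, label: str) -> None:
    """Append the label to the log and mark the image as finished."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{image};{label}\n")  # assumed separator
    with open(finished_path, "a", encoding="utf-8") as done:
        done.write(f"{image}\n")


def pending_images(all_images: Iterable[str], finished_path: str) -> List[str]:
    """Keep only the images that have not been labeled yet, in order."""
    finished = load_finished(finished_path)
    return [name for name in all_images if name not in finished]


# Usage in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "log.txt")
    finished_path = os.path.join(tmp, "finished.txt")  # hypothetical name
    record_label(log_path, finished_path, "a.jpg", "BUTTON1")
    remaining = pending_images(["a.jpg", "b.jpg"], finished_path)
```

Keeping the finished list separate from the log means the skip check only needs image names and does not depend on how labels are written.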
The buttons are shown in two columns. If the number of buttons is odd, the first one occupies the entire row.
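The two-column rule above can be sketched in plain Python as follows. This is an illustration of the layout logic only, not the bot's actual code, and the helper name `layout_rows` is hypothetical. It returns rows of label strings. In python-telegram-bot, each label would presumably be wrapped in a button object before the rows are passed to a keyboard markup.

```python
from typing import List


def layout_rows(labels: List[str]) -> List[List[str]]:
    """Arrange labels in rows of two; with an odd count, the first label gets its own row."""
    labels = list(labels)
    rows: List[List[str]] = []
    if len(labels) % 2 == 1:
        # Odd number of buttons: the first one occupies the entire row.
        rows.append([labels[0]])
        labels = labels[1:]
    # The remaining labels (an even count) are paired into two columns.
    for i in range(0, len(labels), 2):
        rows.append(labels[i:i + 2])
    return rows


rows_even = layout_rows(["BUTTON1", "BUTTON2", "BUTTON3", "BUTTON4"])
rows_odd = layout_rows(["BUTTON1", "BUTTON2", "BUTTON3"])
```

With four buttons this gives two rows of two. With three, the first button sits alone on the top row and the other two share the second row.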
It arose as a response to one of the big problems with image sets: we often do not know which image is which, or how to get feedback from other people about our images in order to label them better.
My Own bot
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Next features in order of importance:
- auto-delete cell images after tagging
- correct a label after a mistake
- improve welcome and help message
- implement internal buffer so that the images go faster
- log in users who tag
- optional “skip image” button
- log who tagged
- verify that all buttons are different
- support that only a closed group of users can tag
- allow N labels for the same image from N different people (that is, instead of a single person saying whether it is a clean or dirty container, let N people do it, with N configurable)