There's more to it than learning the language syntax and features. You also need to learn about Windows(R): how to get the info you need and how to manipulate things. For example, you could make a pixel bot, a ReadProcessMemory bot, or even an injected one. Clicking on a specific piece of text is hard because that generally requires word/letter recognition, which isn't 100% straightforward; it gets complicated depending on the font, etc. There's more than one way to do most tasks: you have to choose which is best for your project.
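To give a feel for the "pixel bot" idea, here is a minimal sketch of its core step: scanning a frame for a pixel of a given color. It assumes you already captured the screen into a plain list of rows of (R, G, B) tuples; the capture itself is left out, and the names `find_pixel` and `tolerance` are made up for illustration.

```python
from typing import Optional, Sequence, Tuple

Color = Tuple[int, int, int]


def find_pixel(
    grid: Sequence[Sequence[Color]],
    target: Color,
    tolerance: int = 0,
) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the first pixel within `tolerance` of `target`.

    The grid is scanned row by row, left to right. Each color channel
    must differ from the target by at most `tolerance`. Returns None
    if no pixel matches.
    """
    for y, row in enumerate(grid):
        for x, pixel in enumerate(row):
            if all(abs(p - t) <= tolerance for p, t in zip(pixel, target)):
                return (x, y)
    return None


# Hypothetical 3x2 "screenshot": all black except one almost-red pixel.
frame = [
    [(0, 0, 0), (0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (0, 0, 0), (250, 5, 0)],
]
hit = find_pixel(frame, (255, 0, 0), tolerance=10)
```

The tolerance matters in practice because anti-aliasing and color settings tend to shift exact values a little. A real bot would then turn the (x, y) result into a click, which is a separate problem.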
If you knew which tutorials to read, it would maybe take 2-3 months before you understand Windows well enough and know what you actually want to do (reading from a process, reading from a file, what a process is, code caves, etc.). It all depends on how much and what you read. It also depends on what you mean by "bot": auto-clickers can be considered bots, and those are relatively simple. Big projects may use several methods (pixel + memory read + reading log files, etc.).
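Of the methods listed, reading a log file is probably the easiest to start with. A rough sketch, assuming the program you care about appends UTF-8 text lines to a file: remember how far you have read and only pick up complete new lines on each poll. The function name `read_new_lines` and the offset handling are my own illustration, not any particular tool's API.

```python
from typing import List, Tuple


def read_new_lines(path: str, offset: int = 0) -> Tuple[List[str], int]:
    """Return (new complete lines, new byte offset) for a growing log file.

    Only lines terminated by a newline are returned; a half-written
    last line is left for the next call.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()

    end = data.rfind(b"\n")
    if end == -1:
        # Nothing complete has been written since the last call.
        return [], offset

    complete = data[: end + 1]
    # splitlines() also copes with Windows-style "\r\n" endings.
    lines = complete.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(complete)
```

You would call this in a loop with a short sleep, feeding the returned offset back in each time.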
-As for Unity, idk, sorry.