Python: Downloading data from the web
Are you tired of going to the browser, downloading the data you want, and then saving it to your desired folder? Well, here is your solution! You can download the data from the web using Python! Let everything be automated!
Let's get started! The data used for this tutorial was downloaded from the following source: https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv.
After you have found your file on the web (it can be any file from any website), the first thing you should do is to right-click on the file and copy its link address, as shown in the figure below.
#Importing libraries
from urllib import request
#Reading the file from the link
file_url = r'https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv?raw=true'
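The letter r before the opening quotation mark makes this a raw string, which means that Python does not treat backslashes as escape characters. It is not required for this link, which contains no backslashes, but it does no harm. Notice also that ?raw=true was added to the end of the link, so that GitHub returns the file itself rather than the web page that displays it. Note that the link address should be inside the quotation marks (' ').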
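To see what the r prefix changes, here is a small sketch. The example strings are made up for illustration and are not part of the download code.

```python
# A normal string: Python turns the two characters \n into one newline
normal = 'a\nb'
# A raw string: the backslash and the n stay as two separate characters
raw = r'a\nb'

print(len(normal))  # 3
print(len(raw))     # 4

# Without backslashes, the prefix changes nothing
same = r'https://example.com/data.csv' == 'https://example.com/data.csv'
print(same)         # True
```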
Now, we will download the file, split it into lines, and save them in a text file (which is not yet created). For this purpose, we will define a function, and then, at the end, we will call this function in order to get the data.
#Defining a function to download the file
def file_info(url):
    #Opening the url file
    file_open = request.urlopen(url)
    #Reading the file (the result is a bytes object)
    file_content = file_open.read()
    #Converting the bytes into a string
    content = file_content.decode('utf-8')
    #Splitting the lines
    lines = content.splitlines()
Notice that the function's name is file_info, and its input is called url. You can rename both as you prefer. However, if you do so, do not forget to change the corresponding names in the upcoming code lines!
Once the function is defined, the first thing it should do is to open the file from the web. For this, the function request.urlopen is needed. Then, in order for Python to go through the whole file and read it, the read method of the opened file is needed.
The content read by Python is a bytes object, which is awkward to work with as text. Thus, the need to convert it to a string arises. After doing so, we must split the string into lines; otherwise, the whole content of the file will be in one long string.
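The conversion step can be tried out on a small hand-made bytes value (the sample data below is invented for illustration). This sketch decodes the bytes with decode('utf-8'), assuming the file is UTF-8 encoded, and also shows what str() returns for the same value.

```python
# Hand-made sample standing in for what read() returns (an assumption)
file_content = b'location,date\nAlbania,2021-01-10\n'

# str() keeps the b'...' wrapper and shows the line break as backslash + n
as_text = str(file_content)
print(as_text[:2])  # b'

# decode() turns the bytes into real text (assuming UTF-8 encoding)
content = file_content.decode('utf-8')

# splitlines() cuts the text at every line break
lines = content.splitlines()
print(lines)        # ['location,date', 'Albania,2021-01-10']
```

Decoding gives clean lines, while the str() result would leave b' at the start of the first line.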
Now that Python is able to read the file from the web, we will save it as a new file in the same directory as our Python script file. For this purpose, we just need 4 more lines of code inside the function!
    #Saving the lines in a text file (still inside the file_info function)
    with open('vaccinations.txt', 'w') as output_file:
        for line in lines:
            save_data = output_file.write(line + '\n')
            print(save_data)
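Python can 'open' a file that does not exist yet when it is opened in write mode. The write mode 'w' creates the text file if it is missing, or empties it if it already exists, so that it is ready to be written. In the first line, output_file is the name of the variable that refers to the opened file. The with statement also closes the file automatically at the end of the block; apart from that, it is similar to this: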
output_file = open('vaccinations.txt', 'w')
Then, the second line of the code is used to go through the lines variable, which contains the content of our web file. Each line is written to the created text file using the write function, which returns the number of characters it wrote; this is the number that print displays. Because splitting the content removed the line breaks, the newline character '\n' is added to the end of each line so that the next one starts on a new line in the text file.
Now that the function both reads the web file and writes the text file, we just need to call it.
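The writing step can be tried on its own with a few sample lines. In this sketch the sample data, the file name example.txt, and the temporary folder are all assumptions made for illustration.

```python
import os
import tempfile

# Sample lines standing in for the content of the web file (an assumption)
lines = ['location,date', 'Albania,2021-01-10']

# A temporary folder keeps this demonstration away from your own files
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'example.txt')

# 'w' creates the file; the with block closes it when the loop is done
with open(path, 'w') as output_file:
    for line in lines:
        written = output_file.write(line + '\n')
        print(written)  # number of characters written for this line

# Reading the file back shows one line per entry
with open(path) as saved_file:
    saved = saved_file.read()
print(saved)
```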
#Calling the function
file_info(file_url)
If we run this code, the text file created by Python will be found in the current working directory, which is the folder of our Python script when the script is run from there.
The final code will look like this:
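The listing itself is not reproduced in this text, so what follows is a sketch that assembles the steps above rather than the original listing. A few details are assumptions added for this sketch: the output_name parameter, the encoding='utf-8' argument, the returned line count, and the use of with around request.urlopen. The bytes are converted with decode('utf-8'), assuming the file is UTF-8 encoded.

```python
# Importing libraries
from urllib import request

# Link to the raw file
file_url = r'https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv?raw=true'


# Defining a function to download the file
def file_info(url, output_name='vaccinations.txt'):
    # Opening the url; the with block closes the connection afterwards
    with request.urlopen(url) as file_open:
        # Reading the whole file as bytes
        file_content = file_open.read()
    # Converting the bytes into a string (assuming UTF-8 encoding)
    content = file_content.decode('utf-8')
    # Splitting the lines
    lines = content.splitlines()
    # Saving the lines in a text file
    with open(output_name, 'w', encoding='utf-8') as output_file:
        for line in lines:
            output_file.write(line + '\n')
    # Returning the number of lines (an addition for this sketch)
    return len(lines)


# Calling the function is left commented out so that nothing is
# downloaded until you decide to run it:
# file_info(file_url)
```

Remove the # in front of the last line to run the download. The text file then appears under the name given by output_name.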
Congratulations! Now you can surprise your programming teacher by downloading any file from any website automatically! In the next tutorial, you will learn how to manipulate huge amounts of data!