VLOOKUP from Excel using Python and Streamlit

few hours with…
7 min readFeb 7, 2023

--

Updated 2024–11. When revisited this project after a long time I noticed some stuff that was not very flexible and easy to use, so I have refactored the code and partly updated this post with some better features.

While working on other projects, I accidentally saw the Streamlit framework and it blew my mind. It is an open-sourced framework that helps build web apps in seconds. Before that, I was using Flask library which is also quite easy to create web apps, but compared to Streamlit it takes much more time. In Streamlit you don't need to think about design, templates, and other front-end stuff, it is already created for you, and for back-end developers, it looks more than nice. And the best part is, you can upload your project on GitHub and use Streamlit servers to push your project online. You will have not the prettiest web address, but if you work with personal projects or projects for small groups of people, I don't think it's a big problem.

In this tutorial, I will show you, how you can create a Streamlit project and push it on the web. As an example, I will use the VLOOKUP function from Excel. Excel is a powerful tool with lots and lots of options, but the majority of people are using probably 1%–2% of Excel power. I wanted to check if we can extract just some functions that we really need for our work and create them in our own reusable environment. I chose VLOOKUP as the first function because my work colleagues have asked me to find info between multiple data a lot of times, so I created a Streamlit web app so they can easily do it by themselves. It is a little bit upgraded version of VLOOKUP, but the main functionality is the same.

The plan is to create a Streamlit web app where we can upload Excel files or templates and the web app will find all the necessary data and connect them together. In the end, we will be able to download a new Excel file with the results.

I will use Visual Studio Code for this project, but you can freely use other IDEs. Let's import the necessary libraries. We will need only three

import streamlit as st
import pandas as pd
from io import BytesIO

To understand how Streamlit works, let's just add simple text and open the result in the browser. Add the following code:

st.write("Look up values")

Next, open the terminal and navigate to the project, if you use Visual Studio Code then you can open the Visual Studio Code terminal and write the command:

streamlit run main.py

If you are in the right place in the terminal, it will open automatically in the browser. It should look like this:

In the top right corner, you have some additional options. When you change something in the code, there will be an open possibility to rerun the Streamlit page or you can press R as a shortcut.

All page consists of some kind of blocks that are built in the Streamlit framework, like inputs, buttons, sliders, etc. You can find all of them by googling “streamlit widgets”.

We will use two Excel files — the first will have to look up a table with all values and the second will have lookup values (values we want to find):

Example Excel files can be found in the “example_data” folder inside the GitHub repo.

So the main task is to take “lookup_values” and pair it with values from the lookup table. To make it more fun, we will have the possibility to choose which column we need to pair. I will not write completely full code in this post, but I will show the main components to make it work, so you can modify code according to your specific needs.

Now we have some example files so we can start to create code. Let's add to the code widget for uploading files and save the result in the variable. And also for this, we will use additional streamlit functionality which allows us to split the screen vertically into multiple columns, in this case, we will make two columns. On the left side we will have an uploader for the lookup table and on the right for lookup values:

col_1, col_2 = st.columns(2)
with col_1:
uploaded_lookup_table = st.file_uploader("Upload Lookup Table", type=["xlsx"])
with col_2:
uploaded_lookup_values = st.file_uploader("Upload Lookup Values", type=["xlsx"])

Now we can check if the file exists and if it exists let's save data from both Excels in variables, but if don't we can return a warning on screen:

if not (uploaded_lookup_table and uploaded_lookup_values):
st.warning("Please upload both files to proceed.")
st.stop()

df_lookup_table = pd.read_excel(uploaded_lookup_table)
df_value_table = pd.read_excel(uploaded_lookup_values)

To see that everything works we can show on screen both excels. As this information could take a lot of space, we can hide it in streamlit widget called expander. Expander allows us to hide stuff we don't need to see all the time.

with st.expander("Preview Uploaded Data"):
col_1, col_2 = st.columns(2)
with col_1:
st.write(f"{labels[0]}:")
st.dataframe(df_lookup_table)
with col_2:
st.write(f"{labels[1]}:")
st.dataframe(df_value_table)

It should look like this when closed:

And like this when opened:

Next, we need to create a few helper functions, and the first will be for creating select tabs in the sidebar. The idea is so we can choose columns by column names. For example, you need to pair lookup values to lookups table column “value 3”. To do that we will create a function that takes in both data frames, allows to choose a specific column, and returns tables with chosen columns:

def sidebar_options(df_lookup_table, df_value_table):
selected_lookup_values = st.sidebar.selectbox(
"Select headings", df_lookup_table.columns, index=0
)
select_column_to_look_in = st.sidebar.selectbox(
"Select column to look in", df_lookup_table.columns, index=1
)
select_main_values = st.sidebar.selectbox(
"Select lookup value column", df_value_table.columns, index=0
)

lookup_table = pd.DataFrame(
{
"keys": df_lookup_table[selected_lookup_values],
"values": df_lookup_table[select_column_to_look_in],
}
)
value_table = pd.DataFrame({"keys": df_value_table[select_main_values]})

return lookup_table, value_table

As a result there now will be a side bar with options for column choice. Both Excels have to be uploaded to make this work. And it will look like this:

As the function will return the final lookup table data frame and value table data frame, then we can finally merge data together accordingly to the value table and get the result:

df_lookup_table_final, df_lookup_values_final = sidebar_options(
df_lookup_table, df_value_table
)

merged_df = pd.merge(
df_lookup_values_final, df_lookup_table_final, on="keys", how="left"
)

st.dataframe(merged_df)

This will give final result:

We can create one more function and use one more streamlit widget so we can download results. So first let's create a function which will download the final result:

def create_excel(data_frame):
output = BytesIO()
with pd.ExcelWriter(output, engine="xlsxwriter") as writer:
data_frame.to_excel(writer, index=False, sheet_name="Results")
workbook = writer.book
worksheet = writer.sheets["Results"]
format1 = workbook.add_format({"num_format": "0.00"})
worksheet.set_column("A:A", None, format1)
return output.getvalue()

At the end add a download button widget:

st.download_button(
label="Download result",
data=create_excel(merged_df),
file_name="result.xlsx"
)

Now we have finished the coding part which works locally on your computer. To add a project to the web we need to upload the project to GitHub. Create a new project on Github, upload “main.py” and create one more file which is called “requirements.txt” and also upload it to GitHub. In the file copy the following:

pandas
openpyxl
xlsxwriter

This will tell Streamlit what libraries are necessary. The final part is to go to the Streamlits web page and sign in using a GitHub account. Create “New app” and choose “From existing repo”:

Next, fill in the info about your GitHub project. In “Repository” add the project. When clicking on the input block it will start to browse through all your GitHub projects, choose the project that you uploaded on GitHub. If you didn't change anything then the branch should be main. And last is a file path, in this case, it's gonna be a file name for the file you uploaded on GitHub.

Next, click “Deploy” and after that, it's going to take some time to publish your project. Just know, that if you don't use the web page for 5 days web app will be in “sleeping mode”. It means that after 5 days of the non-using page, you will need to “wake” it, in reality, it means that the app will redeploy and will take some time to do that.

The final result in streamlit can be found here (if it's sleeping, just “wake” it up). The final version is a bit updated with some other features and demo mode if you don't have Excel files, but the main concept does not change.

Project on the GitHub can be found here.

In the future, I'm planning to update with some additional popular Excel functions, but for now thanks for reading!

--

--

few hours with…
few hours with…

Written by few hours with…

Writing a blog about learning and exploring new stuff just for fun.

Responses (1)