Introduction

In today’s data-driven business landscape, the efficient handling of data is critical for making informed decisions and shaping effective strategies. However, managing data from various sources can be a time-consuming and error-prone process. In this blog post, we will explore a Python script that automates the flow of data from a marketing system accessible through an API to a Google Sheet. This automation not only eliminates manual data entry but also enhances efficiency, ensuring up-to-date and error-free data for analysis and reporting.

The need for this automation arose from the challenges faced by a team that required a dashboard based on data stored in a Google Sheet. Historically, the Google Sheet had been filled manually with data from the marketing system. Recognizing that the marketing system provided an API, the team made a strategic decision to automate the process of updating the Google Sheet with fresh data automatically.

Download the following script at: denioflavio/demo (github.com)

Understanding the Python Script

To gain a deeper understanding, let’s dissect the Python script step by step:

Section 1: Importing Required Libraries

import sys
import urllib
import requests
import gspread
from oauth2client.service_account import ServiceAccountCredentials

The script begins by importing essential Python libraries. These libraries include sys, urllib, requests, gspread, and ServiceAccountCredentials from the oauth2client.service_account. These libraries are essential for making API requests and connecting to Google Sheets.

Section 2: API Authentication and Data Retrieval

# API Connection and GET Request for Receiving Data in JSON Format
token = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJhdWQiOiIzMjki...'

headers = {
    'Content-Type': "application/json",
    'Authorization': 'Bearer ' + token
}

# Prepare the query string
url = 'https://api.example.com/'
data_ini = '2000/01/01'
querystring = {"date_from": data_ini, "per_page": 10}
params = urllib.parse.urlencode(querystring, safe='')

# Send the GET request
try:
    res = requests.request("GET", url, headers=headers, params=params)
except requests.exceptions.RequestException as e:
    print('Error in GET request: %s', str(e))

try:
    data = res.json()
except requests.exceptions.JSONDecodeError as e:
    print("Erro no parse JSON: %s", str(e))

This section focuses on API authentication and data retrieval. The script sets up the API authentication by using a token and sends a GET request to the marketing system’s API. The token and headers are used for authorization, and the requests library is employed to send the GET request with specified query parameters.

Section 3: Token Validation

# If there is an error due to an invalid token, attempt to update it
if data.get('message') == 'Unauthenticated.':
    print('Invalid token, updating.')
    

Following data retrieval, this part of the script checks if the API response contains the message “Unauthenticated.” If the token is found to be invalid or expired, the script is aborted.

Section 4: Integration with Google Sheets

This section is crucial for integrating the script with Google Sheets. Let’s break it down line by line:

# Integration with Google Sheets requires integration keys and the sheet's ID
# Load the sheet, clear it, and insert the header in the first row
print('Loading the Google Drive spreadsheet')
  • The comment underscores the significance of integrating the script with Google Sheets, which necessitates the presence of integration keys and the sheet’s unique identifier.
  • The print statement serves to notify the user that the Google Sheet is being loaded.
key = 'integration-example-dc09e266cb63.json'
  • In this line, the script assigns the file name ‘integracao-followize-dc09e266cb63.json’ to the key variable. This JSON file typically contains the credentials required for accessing Google Sheets.
  • This file is provided by Google. Follow this instructions to obtain yours.
id_sheet = '1PKT7Ax_vzGCBz71akLu33eQ1GBqdGrwcgVxclcl80cI'
  • The id_sheet variable is assigned the value ‘1PKT7Ax_vzGCBz71akLu3geQ1HBqqGrwcgVxclcl80cI,’ which is the unique identifier of the Google Sheet to which the data will be inserted.
scope = ["https://www.googleapis.com/auth/spreadsheets"]
  • This line specifies the desired scope of access for Google Sheets, including authorization for managing spreadsheets.
creds = ServiceAccountCredentials.from_json_keyfile_name(key, scope)
  • The script utilizes the ServiceAccountCredentials.from_json_keyfile_name method to create credentials using the provided JSON key file (key) and the specified scope (scope). These credentials are fundamental for authenticating the script with Google Sheets.
client = gspread.authorize(creds)
  • The client variable is set to the result of authorizing the script using the previously generated credentials. This step enables the script to interact with Google Sheets on behalf of the authorized user.
spreadsheet = client.open_by_key(id_sheet)
  • Here, the spreadsheet variable is assigned the result of opening the Google Sheet identified by id_sheet. This allows the script to access and manipulate the specified Google Sheet.
sheet = spreadsheet.get_worksheet(0)
  • Finally, the sheet variable is set to the first worksheet (index 0) within the opened Google Sheet. This is the target worksheet where the data will be populated.
# Clear the spreadsheet
sheet.clear()
  • The accompanying comment indicates that the script is about to clear the contents of the selected Google Sheet, preparing it for data insertion. This is achieved using the sheet.clear() method, ensuring a clean slate before the new data is inserted.

Section 5: Data Processing and Insertion

# Record lead data in a list, one lead at a time
total_rows = []
for lead in data['data']:
    row = []
    row.append(lead['id'])
    row.append(lead['created_at'])
    row.append(lead['contact']['address']['city'])
    row.append(lead['status'])
    row.append(lead['interests']['interest_1']['name'])
    row.append(lead['conversion_goal'])
    row.append(lead['tracking']['source'])
    row.append(lead['tracking']['medium'])
    total_rows.append(row)

In this section, the script processes the lead data retrieved from the API. It iterates through each lead, extracting essential information such as lead ID, creation date, city, status, interests, conversion goals, and tracking details. This information is structured into rows and appended to a list named total_rows.

Section 6: Populating the Google Sheet

# Send the list of leads to the spreadsheet
sheet.append_rows(total_rows)

In the final step, the script appends the processed data.

Conclusion

Automating the flow of data from marketing systems to Google Sheets, as demonstrated by this Python script, is a valuable way to save time and ensure data accuracy. This automation eliminates the need for manual data entry, reduces the risk of errors, and provides up-to-date information for analysis and decision-making. This script empowers businesses to leverage their data more effectively, focusing on deriving insights rather than performing data entry and updates manually.

Incorporating this script into your workflow can significantly improve data management and reporting processes, providing a competitive advantage in today’s data-driven business landscape. Python’s versatility in handling data-related tasks makes it a powerful tool for streamlining business operations.

Categories: GoogleJSON

0 Comments

Leave a Reply

Avatar placeholder

Your email address will not be published. Required fields are marked *