Karl the Planner Bot

Made by Angela Wang

Found in Final Project - Weird Conversations

Karl the Planner Bot is an SMS bot that tells you about the ongoing development projects in San Francisco.

Intention

Having been a city planner, I have always been interested in the ways cities communicate with their residents. Most cities have very detailed parcel-level data about land uses and changes, but the data is used primarily for technical purposes. Without GIS or advanced data processing skills, the information is very difficult to access for groups (e.g., local advocates, developers, community organizers) who may find it useful in their work.

This bot is intended to answer the question:

 "How might we leverage existing data to encourage more residents to pay attention to changes in their neighborhood – through a chat bot?"

I wanted to explore how far I could repurpose a very technical database for more casual use cases, such as residents looking into what's up in their neighborhood.

Context

There has been a movement among governments to adopt chat bots as an alternative interface for public services. Most of these bots are built on top of an existing website to give citizens a more efficient way to reach the web pages they need. Examples include:

  • "Emma" for the US Citizenship and Immigration Services (USCIS)
  • "Open Data KC" for Kansas City
  • "Alex" for the Australian Taxation Office
  • "PMC" for the Pune Municipal Corporation in India
  • "MISSI" for Mississippi State

On the other hand, there have been few chat bots that deliver real government services or intelligent interactions. One example is PAIGE, built in early 2018 by Yeti, which is designed to automate the procurement process for staff at the City of San Francisco.

Seeing this opportunity, I wanted to build a bot that goes beyond navigation and provides real insights and services to citizens.

Process

1. Understanding the Data

Having already built the "San Francisco Expert" MeBot (https://bit.ly/2yM29Xu), I found the initial API setup fairly straightforward, except that I ultimately decided not to use the API. The database was large and, for the most part, inconsistently formatted. I downloaded it as a .csv, cleaned up the data a bit, and saved it as .json.
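The CSV-to-JSON step might look roughly like the sketch below. The column names (NAMEADDR, BESTSTAT, PD) and the "1 - Richmond" format come from the code later in this write-up; the helper name csv_to_json, the cleaning rules, and the sample rows are assumptions made for illustration, not the actual cleanup that was performed.

```ruby
require 'csv'
require 'json'

# Hypothetical helper: parse CSV text, tidy each cell, and return a JSON string.
def csv_to_json(csv_text)
  rows = CSV.parse(csv_text, headers: true).map do |row|
    row.to_h.transform_values do |value|
      cleaned = value.to_s.strip        # drop stray whitespace around the cell
      cleaned.empty? ? nil : cleaned    # treat blank cells as missing data
    end
  end
  JSON.pretty_generate(rows)
end

# Made-up sample rows, only to show the shape of the data.
sample = <<~CSV
  NAMEADDR,BESTSTAT,PD
  " 123 Example St ",CONSTRUCTION,1 - Richmond
  456 Sample Ave,,1 - Richmond
CSV

puts csv_to_json(sample)
```

Turning blank cells into nil at this stage keeps the later sentence-building code from having to guess whether an empty string means "missing".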

2. Designing Bot Personality

Building the bot personality was one of the toughest parts of the process, and one I wish I had pushed further. I explored several personalities: (1) a long-term SF resident who is opinionated about SF development, (2) a friendly corner store owner who knows the neighborhood inside out, and finally (3) a City Hall intern who is courteous but has limited knowledge.

Because the data was so technical, it was really hard to jazz it up and make the information flow into everyday conversation ("I can't believe that apartment building only has 13 affordable units!"). As bland as the final bot's personality turned out to be, I actually appreciated the mild formality and was happy with the tone I arrived at.

3. Building the Dialog Flow & Tone

I started with a very ambitious flow in which the bot recommended projects near the user and provided updates on projects the user had subscribed to. Very quickly I realized how much complexity and technical work goes into setting up all of those options.

Eventually I focused on building a no-dead-end chat flow that (1) clearly outlines the options, (2) minimizes the keywords users have to memorize (e.g., refresh), and (3) provides options for feedback and improvement.
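One way to picture a no-dead-end flow is a reply function whose fallback always restates the available options. The sketch below is an illustration under assumptions: the keywords, the OPTIONS table, and most reply strings are hypothetical (only "Which neighborhood do you live in?" appears in the bot's code), and this is not the bot's actual routing.

```ruby
# Hypothetical menu: keyword => description shown to the user.
OPTIONS = {
  "neighborhood" => "get a summary of projects in your neighborhood",
  "random"       => "hear about a random ongoing project",
  "feedback"     => "tell me how I can improve"
}.freeze

# One line that lists every option, reused by the fallback reply.
def menu
  OPTIONS.map { |keyword, description| "'#{keyword}' to #{description}" }.join(", ")
end

# Unrecognized input falls back to the menu, so the user never hits a dead end.
def reply_to(message)
  keyword = OPTIONS.keys.find { |k| message.to_s.downcase.include?(k) }
  case keyword
  when "neighborhood" then "Which neighborhood do you live in?"
  when "random"       then "Here is a project you might not know about..."
  when "feedback"     then "Thanks! What should I do better?"
  else "Sorry, I didn't catch that. You can say #{menu}."
  end
end

puts reply_to("tell me about my neighborhood")
puts reply_to("???")
```

Because the fallback spells out the keywords, users do not have to memorize them.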

Product

The final product is a chatbot designed for planners or individuals who are somewhat familiar with urban development to quickly learn about ongoing projects in San Francisco.

The bot has two main modes: (1) providing a summary of a neighborhood, or (2) providing a random ongoing project in the city (as a fun fact).

1. Neighborhood Specific Info

Users can ask for information by San Francisco neighborhood. The current version of Karl requires users to know the names of SF neighborhoods.

The original data from DataSF's API was organized by "Planning District", a regional unit used by planners but not by the general public, which is more familiar with "neighborhoods". To overcome this without altering the original data, I did some quick mapping and created a second lookup table (SF_Neighborhood.csv) that lists the approximate Planning District for each neighborhood.
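The idea of the neighborhood-to-Planning-District lookup can be sketched in a few lines. Everything here is illustrative: the rows are placeholders (only the "1 - Richmond" format and the fact that Castro maps to two districts come from this write-up), and the method names districts_for and describe are hypothetical. Unlike the bot's code, this sketch also ignores letter case, which is an added assumption.

```ruby
# Illustrative rows only: [planning district, simplified name, neighborhood].
# "PD A" and "PD B" are placeholders, not real district names.
NEIGHBORHOOD_ROWS = [
  ["1 - Richmond", "Richmond", "Richmond"],
  ["PD A", "A", "Castro"],
  ["PD B", "B", "Castro"]
].freeze

# Return every row whose neighborhood matches the user's input.
def districts_for(neighborhood)
  wanted = neighborhood.to_s.strip.downcase
  NEIGHBORHOOD_ROWS.select { |_pd, _simple, name| name.downcase == wanted }
end

# Summarize the lookup using the same result strings as the bot's code.
def describe(neighborhood)
  matches = districts_for(neighborhood)
  case matches.length
  when 0 then "no match"
  when 1 then matches[0][0]
  else "2PD - #{neighborhood}"
  end
end

puts describe("Richmond")
puts describe("Castro")
```

Keeping the mapping in its own table leaves the original pipeline data untouched, so a newer quarterly file can be swapped in without redoing the mapping.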

The actual code includes two methods: one (determine_PD) to determine whether the input neighborhood has a valid matching Planning District, and another (determine_PD_info) to scan through the database and count the number of projects in that Planning District.

require 'csv'

def determine_PD x
    
    #this is the answer to the question "Which neighborhood do you live in?"
    csv_SF_Neighborhood = 'SF_Neighborhood.csv'
    search_criteria_neighborhood = { 'Neighborhood_AW' => x}

    options = { :headers => :first_row,
                :converters => [ :numeric ]}

    neighborhood_matches = nil

    # pass the options as keyword arguments (a positional hash fails on Ruby 3+)
    CSV.open(csv_SF_Neighborhood, "r", **options) do |csv|

        neighborhood_matches = csv.find_all do |row|
            match = true
            search_criteria_neighborhood.keys.each do |key|
               match = match && ( row[key] == search_criteria_neighborhood[key]) 
            end
            match
        end    
    end
    
    

    # if more than one PD match (Outer Mission, Castro, West of Twin Peaks, Sunset)
    if neighborhood_matches.length >=2
        if x == "Outer Mission"
            "2PD - Outer Mission"
        elsif x == "Castro"
            "2PD - Castro"    
        elsif x == "West of Twin Peaks"
            "2PD - West of Twin Peaks"
        elsif x == "Sunset"
            "2PD - Sunset"
        else
            "no match"
        end 
    # if one match - perfect!    
    elsif neighborhood_matches.length == 1
        # based on the data there is no project in Presidio
        if x == "Presidio"
            "no project"
        else
            # remember the match for the follow-up methods
            session["PD"] = (neighborhood_matches[0])[0]        #only one item [0], and PD is column 0 (the "1 - Richmond" format)
            session["PD_SIMP"] = (neighborhood_matches[0])[1]     #only one item [0], and PD_SIMP is column 1 (the "Richmond" format)
            # ================= print result – start =================
            # must be the last expression in this branch so the method returns the PD
            session["PD"]
            # ================= print result – end =================
        end
    # if no match 
    else
        "no match"
    end
    
    
end
require 'json'

def determine_PD_info x
        
        match_project_NAMEADDR = []
        match_project_BESTSTAT = []
        match_project_DBIDESC = []

        match_project_AFFORDABLENET = []

        #match_project_Longitude = []
        #match_project_Latitude = []

        sum_match_project = 0
        match_project_construction = 0
        match_project_approved = 0
        match_project_underreview = 0

        # (picking a random project is handled separately in determine_random_project)

        file = File.read('SF_Development_Pipeline_2018_Q2_AWedit.json')
        all_projects = JSON.parse(file)

        # collect the fields of every project in the user's Planning District
        all_projects.each do |row|
            if row["NAMEADDR"] != 0
                if row["PD"].to_s == session["PD"]
                    match_project_NAMEADDR.push(row["NAMEADDR"])
                    match_project_BESTSTAT.push(row["BESTSTAT"])
                    match_project_DBIDESC.push(row["DBIDESC"])
                end
            end
        end

        
        sum_match_project = match_project_NAMEADDR.count
        #sum_affordablenet = match_project_AFFORDABLENET.map(&:to_i).reduce(:+)
        match_project_construction = match_project_BESTSTAT.count("CONSTRUCTION")
        match_project_approved = match_project_BESTSTAT.count("BP APPROVED") + match_project_BESTSTAT.count("PL APPROVED") + match_project_BESTSTAT.count("PL Approved")
        match_project_underreview = match_project_BESTSTAT.count - match_project_construction - match_project_approved

        # ================= print result – start ================
        "Found it! " + x +" is actually part of the larger " + session["PD_SIMP"] + " Planning District, which has " + match_project_construction.to_s + " projects under construction, " + match_project_approved.to_s + " approved, and " + match_project_underreview.to_s + " under review."
        
        # ================= print result – end =================

end
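
The status counts in determine_PD_info depend on the exact spelling of each BESTSTAT value (note the separate counts for "PL APPROVED" and "PL Approved"). A possible alternative, sketched here as an assumption rather than as the bot's actual code, is to normalize the case first and then sort each status into one of the three buckets. The method name bucket_statuses is hypothetical.

```ruby
# Group raw BESTSTAT values into the three buckets the bot reports.
# As in the original method, anything that is neither under construction
# nor approved is counted as under review.
def bucket_statuses(statuses)
  counts = { construction: 0, approved: 0, under_review: 0 }
  statuses.each do |status|
    normalized = status.to_s.strip.upcase   # "PL Approved" -> "PL APPROVED"
    if normalized == "CONSTRUCTION"
      counts[:construction] += 1
    elsif ["BP APPROVED", "PL APPROVED"].include?(normalized)
      counts[:approved] += 1
    else
      counts[:under_review] += 1
    end
  end
  counts
end

p bucket_statuses(["CONSTRUCTION", "PL Approved", "BP APPROVED", "BP FILED"])
```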

2. Any Project in the Neighborhood

After learning about the projects in the Planning District, users have the option to look into the details of one project.

The method (determine_random_project) reads the full database, collects the projects in the specific PD, and then randomly chooses one to output.

def determine_random_project x   
     
    input = x
    
    match_project_NAMEADDR = []
    match_project_BESTSTAT = []
    match_project_DBIDESC = []

    file = File.read('SF_Development_Pipeline_2018_Q2_AWedit.json')
    all_projects = JSON.parse(file)

    # an empty session["PD"] means "no filter": consider every project in the city
    pd_filter = session["PD"].to_s

    all_projects.each do |row|
        next if row["NAMEADDR"] == 0    # skip rows without a name/address
        if pd_filter.empty? || row["PD"].to_s == pd_filter
            match_project_NAMEADDR.push(row["NAMEADDR"])
            match_project_BESTSTAT.push(row["BESTSTAT"])
            match_project_DBIDESC.push(row["DBIDESC"])
        end
    end

    # nothing to pick from (e.g. a Planning District with no projects)
    return "no project" if match_project_NAMEADDR.empty?

    sum_match_project = match_project_NAMEADDR.count
    random_n = rand(sum_match_project)
    # to_s guards against blank (nil) fields in the data
    random_proj_NAMEADDR = match_project_NAMEADDR[random_n].to_s.downcase.strip
    random_proj_BESTSTAT = match_project_BESTSTAT[random_n].to_s.downcase.strip
    random_proj_DBIDESC = match_project_DBIDESC[random_n].to_s.downcase.strip



    # ================= print result – start =================
    #if I have time i'd add conditions to fix the phrase before "beststat"
    random_proj_NAMEADDR + ", currently under the status " + random_proj_BESTSTAT + ", will " + random_proj_DBIDESC + ". "    
    # ================= print result – end =================
    
end

3. Random Project Info

Alternatively, users may ask about any project in San Francisco (without the PD filter). This uses the same method as above (determine_random_project). Since session["PD"] is reset to "", the method simply scans through all projects.

Here is the full interaction:

Karl the Planner Bot
Angela Wang - https://youtu.be/-9Atrc46Ogg

Going into this project, I knew I'd be working with a dataset whose technical nature would limit how natural the resulting conversation could sound. I wanted to use this project to make the point that, with a few edits, technical data can serve a wider range of use cases and benefit a larger population.

If I had more time, here are a few items I would like to tackle:

  • tweak the language of technical terms to be more natural-sounding (e.g., bp filed --> Building Permit Filed)
  • better handle blank/missing data in a sentence thread
  • incorporate a database to remember users' requests and send updates when the status of a project changes
  • work more on the personality of the bot and its tone of voice
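
The first two items above could be approached along the lines of the following sketch. It is a rough illustration, not part of the current bot: the STATUS_LABELS table, its wording, and the method names friendly_status and project_sentence are all hypothetical, and the sample inputs are made up.

```ruby
# Hypothetical lookup from raw status codes to friendlier wording.
# The labels are assumed phrasing chosen for illustration.
STATUS_LABELS = {
  "bp filed"     => "permit filed",
  "bp approved"  => "permit approved",
  "construction" => "under construction"
}.freeze

# Translate a raw status; unknown statuses fall back to the cleaned raw text.
def friendly_status(raw)
  key = raw.to_s.strip.downcase
  STATUS_LABELS.fetch(key, key)
end

# Build the sentence from whichever parts are present, skipping blanks.
def project_sentence(name, status, description)
  return "I couldn't find details for that project." if name.to_s.strip.empty?

  sentence = name.to_s.strip
  status_text = friendly_status(status)
  sentence += " (#{status_text})" unless status_text.empty?
  desc = description.to_s.strip
  sentence += " will #{desc.downcase}" unless desc.empty?
  sentence + "."
end

puts project_sentence("1 Main St", "BP FILED", "Build 10 units")
puts project_sentence("1 Main St", nil, "")
```

Assembling the sentence piece by piece means a missing status or description shortens the reply instead of leaving a gap in the middle of it.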

Overall it was a great learning experience, and I hope to push the bot a bit further to function as a true resource for the public!

Courses

49714 Programming for Online Prototypes

A hands-on introduction to building online products and services through code


Created

October 18th, 2018