Representing procedural knowledge as programs

Principal Investigator: Graham Neubig, Assistant Professor, Language Technologies Institute, School of Computer Science 

Co-PI: Eduard Hovy, Research Professor, Language Technologies Institute, School of Computer Science

We have received funding from the Carnegie Bosch Institute for the project “Representing Procedural Knowledge as Programs.” As technology progresses, smart assistants are expected to perform increasingly complicated tasks. For example, a user in a smart home may ask, “Assistant, check how many eggs I have in the refrigerator.” This deceptively simple query requires rich procedural knowledge: how do we “find the refrigerator,” “check how many,” or “identify eggs”? In current personal assistants, these skills are hand-coded by system developers, which requires significant labor, cannot cover the long tail of requests, and makes it difficult to adapt to new users. While declarative knowledge bases of propositional symbols and expressions are of great use in virtual agents, large-scale knowledge bases of procedural knowledge have remained largely unexplored.

We propose new methods for (1) representing procedural knowledge as programs, specifically programs in widely used programming languages such as Python, and (2) automatically creating or learning such knowledge. To use this knowledge in personal assistants, we will build on our previous work (Yin and Neubig, 2017; Dasigi and Hovy, in prep.) on neural network methods for synthesizing programs from natural language. We will address three questions:

  1. How can we create and formalize large-scale procedural knowledge by mining existing sources on the web (e.g., QA sites such as StackOverflow)?
  2. How can we merge the extracted procedural knowledge with existing declarative knowledge bases and machine-learned classifiers (e.g., an object detector to “identify eggs”)?
  3. How can we help users create personalized procedural knowledge bases by teaching the agent new skills (e.g., “check how many X in the refrigerator,” which is composed of multiple component skills)?
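
As a rough illustration of what "procedural knowledge as a program" could look like, the egg-counting request might be written as a short Python function composed from smaller skills. This is only a sketch under stated assumptions: every name in it (HOME_CONTENTS, find_container, identify, count, check_how_many) is hypothetical and not part of the project, the dictionary stands in for a declarative knowledge base, and the string match in identify stands in for a machine-learned object detector.

```python
from typing import Dict, List

# Hypothetical stand-in for a declarative knowledge base of the home.
HOME_CONTENTS: Dict[str, List[str]] = {
    "refrigerator": ["egg", "milk", "egg", "butter", "egg"],
}


def find_container(name: str) -> List[str]:
    """'Find the refrigerator': look up a container's contents (toy lookup)."""
    return HOME_CONTENTS[name]


def identify(items: List[str], label: str) -> List[str]:
    """'Identify eggs': placeholder for a learned object detector."""
    return [item for item in items if item == label]


def count(items: List[str]) -> int:
    """'Check how many': count the items that were identified."""
    return len(items)


def check_how_many(label: str, container: str) -> int:
    """Composite skill built from the three component skills above."""
    return count(identify(find_container(container), label))


if __name__ == "__main__":
    print(check_how_many("egg", "refrigerator"))
```

Because the composite skill is parameterized by label and container, the same program would cover "check how many X in the refrigerator" for any X the identify step can recognize.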