
Learning Design Patterns with Bayesian Grammar Induction


-By Jerry O. Talton (Intel Corporation), Lingfeng Yang (Stanford University), Ranjitha Kumar (Stanford University), Maxine Lim (Stanford University), Noah D. Goodman (Stanford University), Radomír Měch (Adobe Corporation)


This post is an extension of the earlier post.




ABSTRACT

Design patterns have proven useful in many creative fields, providing content creators with archetypal, reusable guidelines to leverage in projects. Creating such patterns, however, is a time-consuming, manual process, typically relegated to a few experts in any given domain. In this paper, we describe an algorithmic method for learning design patterns directly from data using techniques from natural language processing and structured concept learning. Given a set of labeled, hierarchical designs as input, we induce a probabilistic formal grammar over these exemplars. Once learned, this grammar encodes a set of generative rules for the class of designs, which can be sampled to synthesize novel artifacts. We demonstrate the method on geometric models and Web pages, and discuss how the learned patterns can drive new interaction mechanisms for content creators.

courtesy: webflow

As creative fields mature, a set of best practices emerges for design. Often, attempts are made to codify these practices into formal rules for designers that set out principles of composition, describe useful idioms, and summarize common aesthetic sensibilities. Such design patterns have proven popular and influential in fields such as architecture, software engineering, interaction, and Web design.

Despite their popularity, design patterns are also problematic. For one, they are difficult to operationalize: users bear the burden of locating reputable sources of design knowledge, assimilating that knowledge, and applying it to their own problems. For another, patterns must be painstakingly formulated and compiled by experts, resulting in guidelines that may be less descriptive of actual practice than prescriptive of a particular designer’s point of view.

A more attractive proposition is to learn design patterns directly from data and encapsulate them in a representation that can be accessed algorithmically. In this paper, we address this problem for one common class of designs: those comprising a hierarchy of labelled components. Such hierarchies abound in creative domains as diverse as architecture, geometric modelling, document layout, and Web design.

To learn patterns in a principled way, we leverage techniques from natural language processing and structured concept learning. In particular, we cast the problem as grammar induction: given a corpus of example designs, we induce a probabilistic formal grammar over the exemplars. Once learned, this grammar gives a design pattern in a human-readable form that can be used to synthesize novel designs and verify extant constructions.

The crux of this induction is learning how to generalize beyond the set of exemplars: we would like to distil general principles from the provided designs without extrapolating patterns that are not supported by the data. To this end, we employ an iterative structure learning technique called Bayesian Model Merging, which formulates grammar induction as Bayesian inference. The method employs an inductive bias based on the law of succinctness, also known as Occam’s razor, searching for the simplest grammar that best fits the examples. Since compactness and generality are inextricably linked in grammar-based models, the method provides a data-driven way to learn design patterns that are neither too specific nor overly general.
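The trade-off between fit and simplicity can be made concrete with a small scoring sketch. It assumes a prior of the form log P(G) = -lam * DL(G), where DL is a symbol count for the grammar; the function name `log_posterior`, the weight `lam`, and the toy numbers are illustrative assumptions, not values from the paper.

```python
import math

def log_posterior(log_likelihood, description_length, lam=1.0):
    """Unnormalized log posterior: log P(D | G) + log P(G),
    assuming log P(G) = -lam * description_length."""
    return log_likelihood - lam * description_length

# Toy numbers for two exemplars (illustrative only):
# a specific grammar gives each exemplar probability 1/2 but needs 16 symbols;
# a merged grammar gives each exemplar probability 1/4 and needs 12 symbols.
specific = dict(log_likelihood=2 * math.log(1 / 2), description_length=16)
merged = dict(log_likelihood=2 * math.log(1 / 4), description_length=12)

for lam in (1.0, 0.1):
    best = max(("specific", specific), ("merged", merged),
               key=lambda kv: log_posterior(lam=lam, **kv[1]))
    print(lam, best[0])
```

With a strong size penalty (`lam=1.0`) the smaller, more general grammar scores higher; with a weak penalty (`lam=0.1`) the grammar that fits the exemplars more tightly wins. This is the sense in which the prior controls how far the search generalizes.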

We demonstrate the method on two distinct classes of designs: geometric models (based on scene graphs), and Web pages (based on Document Object Model trees). We report on statistical analyses of the grammars induced by our technique, share the results from a small user study, and discuss how these sorts of probabilistic models could lead to better tools and interaction mechanisms for design.

ALGORITHM OVERVIEW

The method takes as input a set of designs in the form of labelled trees, where each label is drawn from a discrete dictionary C. The algorithm begins by traversing each tree and creating a production rule for every node to generate the least general conforming grammar (LGCG). The grammar is conforming in the sense that every exemplar is a valid derivation from it; it is the least-general such grammar because it derives only the exemplars, with no additional generalization capacity. Once this grammar is constructed, Markov chain Monte Carlo optimization is used to explore a series of more general conforming grammars by merging and splitting nonterminal symbols. Each merge operation takes two nonterminals, rewrites them to have a common name, and unions their productions; each split operation is the reverse of a merge. To judge the quality of each candidate grammar, we adopt a Bayesian interpretation that balances the likelihood of the exemplar designs against the description length of the grammar. At each step in the optimization, we randomly select a split or merge move to apply, and evaluate the posterior of the resultant grammar. This search procedure is run until it exhausts a predetermined computational budget and the maximum a posteriori estimate is returned.
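The first two steps, building the least general conforming grammar and applying merges, can be sketched as follows. This is a minimal illustration under assumed conventions: trees are `(label, children)` tuples, nonterminals are named `N0`, `N1`, ..., and the helper names `build_lgcg`, `merge`, and `derive` are hypothetical. The split move, the MCMC search, and the posterior evaluation are omitted.

```python
from collections import defaultdict
from itertools import count, product

# A design is a labelled tree: (label, [child, child, ...]).
# A grammar maps each nonterminal to a set of productions
# (label, tuple_of_child_nonterminals).

def build_lgcg(trees):
    """One fresh nonterminal and one production per node,
    so the grammar derives exactly the exemplars."""
    fresh = count()
    rules = defaultdict(set)
    starts = set()

    def visit(node):
        label, children = node
        nt = f"N{next(fresh)}"
        rules[nt].add((label, tuple(visit(c) for c in children)))
        return nt

    for tree in trees:
        starts.add(visit(tree))
    return dict(rules), starts

def merge(rules, starts, a, b):
    """Rename nonterminal b to a everywhere and union their productions."""
    ren = lambda x: a if x == b else x
    merged = defaultdict(set)
    for nt, prods in rules.items():
        for label, rhs in prods:
            merged[ren(nt)].add((label, tuple(ren(x) for x in rhs)))
    return dict(merged), {ren(s) for s in starts}

def derive(rules, nt):
    """Enumerate every tree derivable from nt (assumes a non-recursive grammar)."""
    out = []
    for label, rhs in rules[nt]:
        for kids in product(*(derive(rules, c) for c in rhs)):
            out.append((label, list(kids)))
    return out

exemplars = [
    ("ship", [("wing", []), ("hull", [])]),
    ("ship", [("fin", []), ("engine", [])]),
]
lgcg, starts = build_lgcg(exemplars)

# Merge the two "first part" symbols, the two "second part" symbols, and the roots.
g, s = lgcg, starts
for a, b in [("N1", "N4"), ("N2", "N5"), ("N0", "N3")]:
    g, s = merge(g, s, a, b)

designs = [d for start in s for d in derive(g, start)]
print(len(designs))  # 4
```

The LGCG derives only the two exemplar ships. After three merges the grammar has three nonterminals and derives four ships, including the unseen combinations wing plus engine and fin plus hull, which shows how merging trades specificity for generality.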





Figure 4 shows typical samples from the most general conforming grammar (MGCG) and from the optimal grammar induced from a set of spaceship models. Models produced from the MGCG resemble the exemplar models only locally; in contrast, the models synthesized with our technique exhibit similar global structure.


Figure 5 shows fifty distinct random samples from a design pattern induced from six Japanese castle models.




Figure 6 shows fifty distinct random samples from a design pattern induced from eight different sakura tree models.



Figure 7 shows modes from the distribution defined by the grammar of Seussian architecture, along with their probabilities. While the majority of the produced designs are plausible, these samples also highlight some of the limitations of our framework (highlighted in red). Because we induce context-free grammars, it is not possible for these design patterns to reliably learn high-level semantic constraints like “every building must have at least one door.”
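A toy calculation illustrates why such a constraint is hard to guarantee. It assumes a hypothetical rule set in which a building has three facades and each facade is expanded independently into a door or a window; the symbols and probabilities are invented for illustration and do not come from the induced grammars.

```python
from itertools import product

# Hypothetical toy rules: Building -> Facade Facade Facade,
# Facade -> "door" (0.3) | "window" (0.7), each expanded independently.
FACADE = {"door": 0.3, "window": 0.7}

def building_distribution(n_facades=3):
    """Map each possible building (tuple of facades) to its probability."""
    dist = {}
    for combo in product(FACADE, repeat=n_facades):
        p = 1.0
        for facade in combo:
            p *= FACADE[facade]
        dist[combo] = p
    return dist

dist = building_distribution()
p_no_door = sum(p for building, p in dist.items() if "door" not in building)
print(round(p_no_door, 3))  # 0.7 ** 3 = 0.343
```

Because sibling symbols are expanded independently in this toy grammar, about a third of the sampled buildings have no door at all, even though doors are common in individual facades.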

CONCLUSION

One final direction for investigation is learning more powerful computational models for design. Although stochastic context-free grammars provide a useful and compact generative representation, they are subject to a number of limitations which have led content creators to seek out more powerful graphical models. For one, SCFGs automatically assign higher probabilities to shorter derivations, an assumption well-founded neither in natural language nor design, where model size typically peaks at some intermediate length. For another, the independence assumptions inherent in SCFGs prevent them from accurately representing models which have more general graph (rather than tree) structure, precluding them from capturing symmetries or other distant relationships between disjoint subtrees in a derivation. Similarly, CFGs are fundamentally discrete representations, and cannot easily encode continuous variability.
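The bias toward short derivations is easy to see in a toy case. The sketch below assumes a hypothetical grammar with a single recursive rule, S -> part S with probability `p_continue` and S -> part otherwise; the rule and the value 0.6 are illustrative assumptions.

```python
def derivation_prob(n_parts, p_continue=0.6):
    """Probability of the derivation that yields exactly n_parts parts
    under S -> part S (p_continue) | part (1 - p_continue)."""
    return p_continue ** (n_parts - 1) * (1 - p_continue)

probs = [derivation_prob(n) for n in range(1, 8)]
for n, p in enumerate(probs, start=1):
    print(n, round(p, 4))
```

Whatever value `p_continue` takes between 0 and 1, the probabilities fall geometrically with size, so the most likely design is always the smallest one. A distribution that peaks at an intermediate size cannot be expressed by this rule alone.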
