
Learning Design Patterns with Bayesian Grammar Induction


By Jerry O. Talton (Intel Corporation), Lingfeng Yang (Stanford University), Ranjitha Kumar (Stanford University), Maxine Lim (Stanford University), Noah D. Goodman (Stanford University), and Radomír Měch (Adobe Corporation)


This post is an extension of an earlier blog post.




ABSTRACT

Design patterns have proven useful in many creative fields, providing content creators with archetypal, reusable guidelines to leverage in projects. Creating such patterns, however, is a time-consuming, manual process, typically relegated to a few experts in any given domain. In this paper, we describe an algorithmic method for learning design patterns directly from data using techniques from natural language processing and structured concept learning. Given a set of labeled, hierarchical designs as input, we induce a probabilistic formal grammar over these exemplars. Once learned, this grammar encodes a set of generative rules for the class of designs, which can be sampled to synthesize novel artifacts. We demonstrate the method on geometric models and Web pages, and discuss how the learned patterns can drive new interaction mechanisms for content creators.

Image courtesy: Webflow

As creative fields mature, a set of best practices emerge for design. Often, attempts are made to codify these practices into a set of formal rules for designers which set out principles of composition, describe useful idioms, and summarize common aesthetic sensibilities. Such design patterns have proven popular and influential in fields such as architecture, software engineering, interaction, and Web design.

Despite their popularity, design patterns are also problematic. For one, they are difficult to operationalize: users bear the burden of locating reputable sources of design knowledge, assimilating that knowledge, and applying it to their own problems. For another, patterns must be painstakingly formulated and compiled by experts, resulting in guidelines that may be less descriptive of actual practice than prescriptive of a particular designer’s point of view.

A more attractive proposition is to learn design patterns directly from data and encapsulate them in a representation that can be accessed algorithmically. In this paper, we address this problem for one common class of designs: those comprising a hierarchy of labelled components. Such hierarchies abound in creative domains as diverse as architecture, geometric modelling, document layout, and Web design.
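To make the input representation concrete, here is a minimal sketch (not the authors' code) of a labelled design hierarchy: each node carries a label drawn from a discrete dictionary C and an ordered list of children. The toy Web-page structure below is an illustrative assumption.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DesignNode:
    """A node in a labelled design tree."""
    label: str                                        # drawn from the dictionary C
    children: List["DesignNode"] = field(default_factory=list)

# A toy Web-page hierarchy of the kind the method takes as input
page = DesignNode("page", [
    DesignNode("header", [DesignNode("logo"), DesignNode("nav")]),
    DesignNode("body",   [DesignNode("article"), DesignNode("sidebar")]),
    DesignNode("footer"),
])

def labels(node: DesignNode) -> List[str]:
    """Preorder traversal of the labels in a design tree."""
    return [node.label] + [l for c in node.children for l in labels(c)]
```

The same structure covers scene graphs for geometric models and DOM trees for Web pages; only the label dictionary changes.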

To learn patterns in a principled way, we leverage techniques from natural language processing and structured concept learning. In particular, we cast the problem as grammar induction: given a corpus of example designs, we induce a probabilistic formal grammar over the exemplars. Once learned, this grammar gives a design pattern in a human-readable form that can be used to synthesize novel designs and verify extant constructions.

The crux of this induction is learning how to generalize beyond the set of exemplars: we would like to distil general principles from the provided designs without extrapolating patterns that are not supported by the data. To this end, we employ an iterative structure learning technique called Bayesian Model Merging, which formulates grammar induction as Bayesian inference. The method employs an inductive bias based on the law of succinctness, also known as Occam’s razor, searching for the simplest grammar that best fits the examples. Since compactness and generality are inexorably linked in grammar-based models, the method provides a data-driven way to learn design patterns that are neither too specific nor overly general.
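The trade-off above can be sketched numerically. In Bayesian Model Merging the score of a candidate grammar G given exemplars D is the posterior P(G | D) ∝ P(D | G) · P(G), where the prior penalizes description length (Occam's razor). The `alpha` weight and the specific description-length measure below are illustrative assumptions, not the paper's exact formulation.

```python
def log_posterior(log_likelihoods, description_length, alpha=1.0):
    """Score a candidate grammar G against exemplars D.

    log_likelihoods: list of log P(d | G), one entry per exemplar d in D
    description_length: |G|, e.g. the number of symbols needed to write G down
    alpha: assumed weight on the description-length prior
    """
    log_prior = -alpha * description_length   # P(G) ∝ exp(-alpha * |G|)
    return sum(log_likelihoods) + log_prior

# A more general grammar fits each exemplar slightly worse but is much
# shorter to describe, so it can still win under the posterior:
specific = log_posterior([-1.0, -1.0, -1.0], description_length=50)
general  = log_posterior([-2.0, -2.0, -2.0], description_length=10)
```

Here `general` scores higher than `specific`, which is exactly the pressure that keeps induced patterns neither too specific nor overly general.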

We demonstrate the method on two distinct classes of designs: geometric models (based on scene graphs), and Web pages (based on Document Object Model trees). We report on statistical analyses of the grammars induced by our technique, share the results from a small user study, and discuss how these sorts of probabilistic models could lead to better tools and interaction mechanisms for design.

ALGORITHM OVERVIEW

The method takes as input a set of designs in the form of labelled trees, where each label is drawn from a discrete dictionary C. The algorithm begins by traversing each tree and creating a production rule for every node to generate the least general conforming grammar (LGCG). The grammar is conforming in the sense that every exemplar is a valid derivation from it; it is the least-general such grammar because it derives only the exemplars, with no additional generalization capacity. Once this grammar is constructed, Markov chain Monte Carlo optimization is used to explore a series of more general conforming grammars by merging and splitting nonterminal symbols. Each merge operation takes two nonterminals, rewrites them to have a common name, and unions their productions; each split operation is the reverse of a merge. To judge the quality of each candidate grammar, we adopt a Bayesian interpretation that balances the likelihood of the exemplar designs against the description length of the grammar. At each step in the optimization, we randomly select a split or merge move to apply, and evaluate the posterior of the resultant grammar. This search procedure is run until it exhausts a predetermined computational budget and the maximum a posteriori estimate is returned.
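A merge move can be sketched as follows, under an assumed grammar representation (a dict mapping each nonterminal to a set of right-hand sides); this is illustrative code, not the paper's implementation. Merging rewrites two nonterminals to a common name and unions their productions, which is what generalizes the grammar beyond the exemplars.

```python
def merge(grammar, a, b, merged="M"):
    """Merge nonterminals a and b into a single nonterminal.

    grammar: dict mapping nonterminal -> set of right-hand sides,
             each right-hand side a tuple of symbols.
    """
    rename = lambda sym: merged if sym in (a, b) else sym
    new = {}
    for head, rules in grammar.items():
        renamed = {tuple(rename(sym) for sym in rhs) for rhs in rules}
        new.setdefault(rename(head), set()).update(renamed)
    return new

# Toy grammar: two nonterminals that each derive a single facade layout
g = {
    "S":  {("X1",), ("X2",)},
    "X1": {("door", "window")},
    "X2": {("door", "door")},
}
merged = merge(g, "X1", "X2")
# "X1" and "X2" collapse into one nonterminal carrying both productions
```

A split move would do the reverse, partitioning a nonterminal's productions between two fresh symbols; the MCMC search alternates such moves and keeps the maximum a posteriori grammar.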





Figure 4 shows typical samples from the maximally general conforming grammar (MGCG) and from the optimal grammar induced from a set of spaceship models. Models produced from the MGCG resemble the exemplar models only locally; conversely, the models synthesized with our technique exhibit similar global structure.


Figure 5 shows fifty distinct random samples from a design pattern induced from six Japanese castle models.




Figure 6 shows fifty distinct random samples from a design pattern induced from eight different sakura tree models.



Figure 7 shows modes from the distribution defined by the grammar of Seussian architecture, along with their probabilities. While the majority of the produced designs are plausible, these samples also highlight some of the limitations of our framework (highlighted in red). Because we induce context-free grammars, it is not possible for these design patterns to reliably learn high-level semantic constraints like “every building must have at least one door.”

Conclusion

One final direction for investigation is learning more powerful computational models for design. Although stochastic context-free grammars provide a useful and compact generative representation, they are subject to a number of limitations which have led content creators to seek out more powerful graphical models. For one, SCFGs automatically assign higher probabilities to shorter derivations, an assumption well-founded neither in natural language nor design, where model size typically peaks at some intermediate length. For another, the independence assumptions inherent in SCFGs prevent them from accurately representing models which have more general graph (rather than tree) structure, precluding them from capturing symmetries or other distant relationships between disjoint subtrees in a derivation. Similarly, CFGs are fundamentally discrete representations, and cannot easily encode continuous variability.
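The length bias noted above is easy to demonstrate. In an SCFG with a recursive rule such as S → part S (probability q) | part (probability 1 − q), a derivation with n parts has probability q^(n−1)·(1 − q), which strictly decreases with n. The sketch below (an illustrative example, not from the paper) makes this concrete:

```python
def derivation_prob(n_parts, q=0.5):
    """Probability of an n-part derivation under S -> part S (q) | part (1-q)."""
    return q ** (n_parts - 1) * (1 - q)

# Probabilities fall monotonically with derivation length, so the SCFG
# always favors shorter designs -- unlike real design corpora, where
# model size typically peaks at some intermediate length.
probs = [derivation_prob(n) for n in range(1, 6)]
assert all(probs[i] > probs[i + 1] for i in range(len(probs) - 1))
```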
