====== Reading memo for GRL ======

  * Book: GRL (Graph Representation Learning) by W. L. Hamilton (2020)

General points for **thorough** reading (not extensive reading):
  * Read **every** word and mark/memorize the important words.
  * Try to understand **every piece** the author wants to tell us - if needed, read other references.
  * However, if you still cannot understand something after trying for more than 30 minutes, discuss it with others.
  * Try to use low-entropy descriptions when explaining.

===== #01, liangz =====

Range: from the title page to p8 (end of Chapter 1).

**NOTICE:** please read both the scanned pictures and this memo.

i: title page
  * official book
  * the author, William L. Hamilton, was an Assistant Professor at McGill Univ. at that time
  * McGill Univ. is a famous university in Canada with 12 Nobel laureates; its graduates include Yoshua Bengio (2018 ACM Turing Award, one of the three Godfathers of deep learning)

ii: Abstract page
  * no-free-lunch theorem
  * => an inductive bias is important before optimization (and before machine learning). Example: consider using linear regression to analyze a dataset. The assumption that a linear model is appropriate is an inductive bias.
  * What is induction?
  * Lin & Tegmark '16, "Why does deep and cheap learning work so well?" => assumptions: (1) low polynomial order in the real world, (2) locality of data, (3) symmetry phenomena.
  * 3 graph learning topics: (1) embedding, (2) CNN -> graph, (3) message-passing approach

iii-v: Contents
  * not important now

vi: Preface
  * "past seven years" => graph learning (in this sense) started around 2013
  * as of 2020, one of the fastest growing sub-areas of deep learning
  * audience: should have some background in machine learning and deep learning (e.g., Goodfellow et al., 2016)

vii-viii: Acknowledgements
  * connections of the author (e.g., Jure Leskovec is a famous researcher in network science)

p1: Chapter 1 Introduction
  * node, edge, relation, graph => ask the audience to illustrate some graphs
  * Zachary Karate Club Network (1977) and its importance

p2:
  * The book mentions "a dramatic increase in the quantity and quality of graph data in the last 25 years." => Why 25 years? (hint: it means since 1995)
  * ML is not the only way, but it may be interesting.
  * adjacency matrix, adjacency list, simple graph and {0,1} entries (a small sketch follows after the p5 notes)
  * (optional) converting a graph from an adjacency list to an adjacency matrix seems to be evidence that humans consume matter and energy to increase order.

p3:
  * multi-graph (variable number of types/relations) vs. multi-relational graph (fixed number of types/relations)
  * heterogeneous graph (inner edges more important than inter edges), multipartite graph (no inner edges, only inter edges), multiplex graph (inter edges more important than inner edges)
  * attribute or feature

p4:
  * graph (abstract structure) vs. network (real-world data)
  * supervised (predict an output) vs. unsupervised (infer patterns)
  * node classification: predict the label of a node given a small number of labelled nodes (|V_train| << |V|)

p5:
  * applications: bot detection in a social network, function of proteins in the interactome, classifying the topic of a document based on links, etc.
  * difference from standard supervised learning: whether the iid (independent and identically distributed) assumption/bias holds or not
  * popular inductive biases used in graph learning: homophily (sharing attributes with one's neighbors), structural equivalence (similar local structure -> similar label), heterophily (e.g., gender).
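The following is a minimal sketch (not from the book; the example graph is made up) of the two representations mentioned in the p2 notes: it builds the {0,1} adjacency matrix of a simple undirected graph from its adjacency list.

<code python>
import numpy as np

# Hypothetical simple graph (undirected, no self-loops), given as an adjacency list.
adj_list = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2],
}

# Build the |V| x |V| adjacency matrix with {0,1} entries.
n = len(adj_list)
A = np.zeros((n, n), dtype=int)
for u, neighbors in adj_list.items():
    for v in neighbors:
        A[u, v] = 1

# For a simple undirected graph the matrix is symmetric: A[u, v] == A[v, u].
assert (A == A.T).all()
print(A)
</code>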
p6:
  * supervised learning vs. semi-supervised learning, and graph learning (no iid assumption)
  * relation prediction: e.g., recommendation systems, drug side-effects. Note the requirement of an inductive bias (see the sketch at the end of this memo).

p7:
  * clustering and community detection
  * graph classification, regression, and clustering (to the audience: what is the general difference?)

p8:
  * the iid assumption and why? -> Li-Yang
  * Additional comment: causal relation vs. correlation. ML is often considered to take the latter approach, but we actually need to consider the former.
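As a minimal sketch of relation prediction and its need for an inductive bias (not from the book; the graph and the common-neighbors heuristic are assumptions for illustration), the code below ranks missing edges by how many neighbors the two endpoints share, i.e., it bakes in the bias that such node pairs are likely to be connected.

<code python>
import numpy as np

# Hypothetical adjacency matrix of a small undirected simple graph (made-up data).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
])

def common_neighbors_score(A, u, v):
    # Number of neighbors shared by u and v; encodes the inductive bias that
    # such pairs are more likely to be (or become) connected.
    return int(np.dot(A[u], A[v]))

# Score every non-edge and rank the candidates for relation prediction.
n = A.shape[0]
candidates = [(u, v) for u in range(n) for v in range(u + 1, n) if A[u, v] == 0]
ranked = sorted(candidates, key=lambda e: common_neighbors_score(A, *e), reverse=True)
print(ranked)  # the highest-scoring pairs are the predicted relations
</code>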