====== Reading memo for GRL ======

  * Book: GRL (Graph Representation Learning) by W. L. Hamilton (2020)

General points for **thorough** reading (not extensive reading):
  * Read **every** word and mark/memorize the important words.
  * Try to understand **every piece** the author wants to tell us - if needed, read other references.
  * However, if you still cannot understand something after trying for more than 30 minutes, discuss it with others.
  * Try to use low-entropy descriptions when explaining.

===== #01, liangz =====

Range: from the title page to p8 (end of Chapter 1).

**NOTICE:** please read both the scanned pictures and this memo.

i: title page
  * official book
  * the author, William L. Hamilton, was an Assistant Professor at McGill Univ. at that time
  * McGill Univ. is a famous university in Canada with 12 Nobel laureates; its graduates include Yoshua Bengio (2018 ACM Turing Award, one of the three Godfathers of deep learning)

ii: Abstract page
  * no-free-lunch theorem
  * => an inductive bias is important before optimization (and before machine learning). Example: consider using linear regression to analyze a dataset. The assumption that a linear model is appropriate is an inductive bias.
  * What is induction?
  * Lin & Tegmark '16, "Why does deep and cheap learning work so well?" => assumptions: (1) low polynomial order in the real world, (2) locality of data, (3) symmetry phenomena.
  * 3 graph learning topics: (1) embedding, (2) CNN -> graph, (3) message-passing approach

iii-v: Contents
  * not important now

vi: Preface
  * "past seven years" => graph learning (in this sense) started around 2013
  * as of 2020, one of the fastest growing sub-areas of deep learning
  * audience: should have some background in machine learning and deep learning (e.g., Goodfellow et al., 2016)

vii-viii: Acknowledgements
  * connections of the author (e.g., Jure Leskovec is a famous researcher in network science)

p1: Chapter 1 Introduction
  * node, edge, relation, graph => ask the audience to illustrate some graphs
  * Zachary Karate Club Network (1977) and its importance

p2:
  * The book mentions "a dramatic increase in the quantity and quality of graph data in the last 25 years." => Why 25 years? (hint: it means since 1995)
  * ML is not the only way, but it may be interesting.
  * adjacency matrix, adjacency list, simple graph and {0,1} entries (a small sketch follows after the p5 notes)
  * (optional) converting a graph from an adjacency list to an adjacency matrix seems to be evidence that humans consume matter and energy to increase order.

p3:
  * multi-graph (variable number of types/relations) vs. multi-relational graph (fixed number of types/relations)
  * heterogeneous graph (inner edges more important than inter edges), multipartite graph (no inner edges, only inter edges), multiplex graph (inter edges more important than inner edges)
  * attribute or feature

p4:
  * graph (abstract structure) vs. network (real-world data)
  * supervised (predict an output) vs. unsupervised (infer patterns)
  * node classification: predict the label of a node given a small number of labelled nodes (|V_train| << |V|)

p5:
  * applications: bot detection in a social network, function of proteins in the interactome, classifying the topic of a document based on links, etc.
  * difference from standard supervised learning: whether the iid (independent and identically distributed) assumption/bias holds or not
  * popular inductive biases used in graph learning: homophily (sharing attributes with one's neighbors), structural equivalence (similar local structure -> similar label), heterophily (e.g., gender).
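The following is a minimal sketch (not from the book; the example graph is made up) of the two representations mentioned in the p2 notes: it builds the {0,1} adjacency matrix of a simple undirected graph from its adjacency list.

<code python>
import numpy as np

# Hypothetical simple graph (undirected, no self-loops), given as an adjacency list.
adj_list = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2],
}

# Build the |V| x |V| adjacency matrix with {0,1} entries.
n = len(adj_list)
A = np.zeros((n, n), dtype=int)
for u, neighbors in adj_list.items():
    for v in neighbors:
        A[u, v] = 1

# For a simple undirected graph the matrix is symmetric: A[u, v] == A[v, u].
assert (A == A.T).all()
print(A)
</code>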
p6:
  * supervised learning vs. semi-supervised learning, and graph learning (no iid assumption)
  * relation prediction: e.g., recommendation systems, drug side-effects. Note the requirement of an inductive bias (see the sketch at the end of this memo).

p7:
  * clustering and community detection
  * graph classification, regression, and clustering (to the audience: what is the general difference?)

p8:
  * the iid assumption and why? -> Li-Yang
  * Additional comment: causal relation vs. correlation. ML is often considered to take the latter approach, but we actually need to consider the former.
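As a minimal sketch of relation prediction and its need for an inductive bias (not from the book; the graph and the common-neighbors heuristic are assumptions for illustration), the code below ranks missing edges by how many neighbors the two endpoints share, i.e., it bakes in the bias that such node pairs are likely to be connected.

<code python>
import numpy as np

# Hypothetical adjacency matrix of a small undirected simple graph (made-up data).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
])

def common_neighbors_score(A, u, v):
    # Number of neighbors shared by u and v; encodes the inductive bias that
    # such pairs are more likely to be (or become) connected.
    return int(np.dot(A[u], A[v]))

# Score every non-edge and rank the candidates for relation prediction.
n = A.shape[0]
candidates = [(u, v) for u in range(n) for v in range(u + 1, n) if A[u, v] == 0]
ranked = sorted(candidates, key=lambda e: common_neighbors_score(A, *e), reverse=True)
print(ranked)  # the highest-scoring pairs are the predicted relations
</code>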