Adds a TFF tutorial for compression.

PiperOrigin-RevId: 321852405

Commit cfc3cff7 authored 4 years ago by Weikang Song Committed by tensorflow-copybara 4 years ago
--- a/docs/tutorials/tff_for_federated_learning_research_compression.ipynb
+++ b/docs/tutorials/tff_for_federated_learning_research_compression.ipynb
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "exFeYM4KWlz9"
+      },
+      "source": [
+        "##### Copyright 2020 The TensorFlow Authors."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "cellView": "form",
+        "colab": {},
+        "colab_type": "code",
+        "id": "Oj6X6JHoWtVs"
+      },
+      "outputs": [],
+      "source": [
+        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# you may not use this file except in compliance with the License.\n",
+        "# You may obtain a copy of the License at\n",
+        "#\n",
+        "# https://www.apache.org/licenses/LICENSE-2.0\n",
+        "#\n",
+        "# Unless required by applicable law or agreed to in writing, software\n",
+        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+        "# See the License for the specific language governing permissions and\n",
+        "# limitations under the License."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "d5DZ2c-xfa9m"
+      },
+      "source": [
+        "# TFF for Federated Learning Research: Model and Update Compression\n",
+        "\n",
+        "**NOTE**: This colab has been verified to work with the [latest released version](https://github.com/tensorflow/federated#compatibility) of the `tensorflow_federated` pip package, but the TensorFlow Federated project is still in pre-release development and may not work on `master`.\n",
+        "\n",
+        "In this tutorial, we use the [EMNIST](https://www.tensorflow.org/federated/api_docs/python/tff/simulation/datasets/emnist) dataset to demonstrate how to enable lossy compression algorithms to reduce communication cost in the Federated Averaging algorithm using the `tff.learning.build_federated_averaging_process` API and the [tensor_encoding](http://jakubkonecny.com/files/tensor_encoding.pdf) API. For more details on the Federated Averaging algorithm, see the paper [Communication-Efficient Learning of Deep Networks from Decentralized Data](https://arxiv.org/abs/1602.05629)."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "qrPTFv7ngz-P"
+      },
+      "source": [
+        "## Before we start\n",
+        "\n",
+        "Before we start, please run the following to make sure that your environment is\n",
+        "correctly set up. If you don't see a greeting, please refer to the\n",
+        "[Installation](../install.md) guide for instructions."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "X_JnSqDxlw5T"
+      },
+      "outputs": [],
+      "source": [
+        "#@test {\"skip\": true}\n",
+        "!pip install --quiet --upgrade tensorflow_federated\n",
+        "!pip install --quiet --upgrade tensorflow-model-optimization\n",
+        "\n",
+        "%load_ext tensorboard"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "ctxIBpYIl846"
+      },
+      "outputs": [],
+      "source": [
+        "import functools\n",
+        "\n",
+        "import numpy as np\n",
+        "import tensorflow as tf\n",
+        "import tensorflow_federated as tff\n",
+        "\n",
+        "from tensorflow_model_optimization.python.core.internal import tensor_encoding as te"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "wj-O1cnxKHMw"
+      },
+      "source": [
+        "Verify that TFF is working."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "height": 35
+        },
+        "colab_type": "code",
+        "executionInfo": {
+          "elapsed": 500,
+          "status": "ok",
+          "timestamp": 1595017717284,
+          "user": {
+            "displayName": "",
+            "photoUrl": "",
+            "userId": ""
+          },
+          "user_tz": 420
+        },
+        "id": "8VPepVmfdhHv",
+        "outputId": "b1137927-6999-4d27-9fc9-dab194a1d586"
+      },
+      "outputs": [
+        {
+          "data": {
+            "text/plain": [
+              "b'Hello, World!'"
+            ]
+          },
+          "execution_count": 4,
+          "metadata": {
+            "tags": []
+          },
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "@tff.federated_computation\n",
+        "def hello_world():\n",
+        "  return 'Hello, World!'\n",
+        "\n",
+        "hello_world()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "30Pln72ihL-z"
+      },
+      "source": [
+        "## Preparing the input data\n",
+        "In this section we load and preprocess the EMNIST dataset included in TFF. Please check out the [Federated Learning for Image Classification](https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification#preparing_the_input_data) tutorial for more details about the EMNIST dataset.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "oTP2Dndbl2Oe"
+      },
+      "outputs": [],
+      "source": [
+        "# This value only applies to the EMNIST dataset; choose an appropriate\n",
+        "# value when switching to other datasets.\n",
+        "MAX_CLIENT_DATASET_SIZE = 418\n",
+        "\n",
+        "CLIENT_EPOCHS_PER_ROUND = 1\n",
+        "CLIENT_BATCH_SIZE = 20\n",
+        "TEST_BATCH_SIZE = 500\n",
+        "\n",
+        "emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(\n",
+        "    only_digits=True)\n",
+        "\n",
+        "def reshape_emnist_element(element):\n",
+        "  return (tf.expand_dims(element['pixels'], axis=-1), element['label'])\n",
+        "\n",
+        "def preprocess_train_dataset(dataset):\n",
+        "  \"\"\"Preprocessing function for the EMNIST training dataset.\"\"\"\n",
+        "  return (dataset\n",
+        "          # Shuffle according to the largest client dataset\n",
+        "          .shuffle(buffer_size=MAX_CLIENT_DATASET_SIZE)\n",
+        "          # Repeat to do multiple local epochs\n",
+        "          .repeat(CLIENT_EPOCHS_PER_ROUND)\n",
+        "          # Batch to a fixed client batch size\n",
+        "          .batch(CLIENT_BATCH_SIZE, drop_remainder=False)\n",
+        "          # Preprocessing step\n",
+        "          .map(reshape_emnist_element))\n",
+        "\n",
+        "emnist_train = emnist_train.preprocess(preprocess_train_dataset)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "XUQA55yjhTGh"
+      },
+      "source": [
+        "## Defining a model\n",
+        "\n",
+        "Here we define a Keras model based on the original FedAvg CNN, and then wrap the Keras model in an instance of [tff.learning.Model](https://www.tensorflow.org/federated/api_docs/python/tff/learning/Model) so that it can be consumed by TFF.\n",
+        "\n",
+        "Note that we need a **function** that produces a model, rather than a model instance. In addition, the function **cannot** just capture a pre-constructed model; it must create the model in the context in which it is called. The reason is that TFF is designed to go to devices, and needs control over when resources are constructed so that they can be captured and packaged up."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "f2dLONjFnE2E"
+      },
+      "outputs": [],
+      "source": [
+        "def create_original_fedavg_cnn_model(only_digits=True):\n",
+        "  \"\"\"The CNN model used in https://arxiv.org/abs/1602.05629.\"\"\"\n",
+        "  data_format = 'channels_last'\n",
+        "\n",
+        "  max_pool = functools.partial(\n",
+        "      tf.keras.layers.MaxPooling2D,\n",
+        "      pool_size=(2, 2),\n",
+        "      padding='same',\n",
+        "      data_format=data_format)\n",
+        "  conv2d = functools.partial(\n",
+        "      tf.keras.layers.Conv2D,\n",
+        "      kernel_size=5,\n",
+        "      padding='same',\n",
+        "      data_format=data_format,\n",
+        "      activation=tf.nn.relu)\n",
+        "\n",
+        "  model = tf.keras.models.Sequential([\n",
+        "      tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),\n",
+        "      conv2d(filters=32),\n",
+        "      max_pool(),\n",
+        "      conv2d(filters=64),\n",
+        "      max_pool(),\n",
+        "      tf.keras.layers.Flatten(),\n",
+        "      tf.keras.layers.Dense(512, activation=tf.nn.relu),\n",
+        "      tf.keras.layers.Dense(10 if only_digits else 62),\n",
+        "      tf.keras.layers.Softmax(),\n",
+        "  ])\n",
+        "\n",
+        "  return model\n",
+        "\n",
+        "# Gets the type information of the input data. TFF is a strongly typed\n",
+        "# functional programming framework, and needs type information about inputs to \n",
+        "# the model.\n",
+        "input_spec = emnist_train.create_tf_dataset_for_client(\n",
+        "    emnist_train.client_ids[0]).element_spec\n",
+        "\n",
+        "def tff_model_fn():\n",
+        "  keras_model = create_original_fedavg_cnn_model()\n",
+        "  return tff.learning.from_keras_model(\n",
+        "      keras_model=keras_model,\n",
+        "      input_spec=input_spec,\n",
+        "      loss=tf.keras.losses.SparseCategoricalCrossentropy(),\n",
+        "      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "ipfUaPLEhYYj"
+      },
+      "source": [
+        "## Training the model and outputting training metrics\n",
+        "\n",
+        "Now we are ready to construct a Federated Averaging algorithm and train the defined model on the EMNIST dataset.\n",
+        "\n",
+        "First we need to build a Federated Averaging algorithm using the [tff.learning.build_federated_averaging_process](https://www.tensorflow.org/federated/api_docs/python/tff/learning/build_federated_averaging_process) API."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "SAsGGkL9nHEl"
+      },
+      "outputs": [],
+      "source": [
+        "federated_averaging = tff.learning.build_federated_averaging_process(\n",
+        "    model_fn=tff_model_fn,\n",
+        "    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),\n",
+        "    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "Mn1FAPQ32FcV"
+      },
+      "source": [
+        "Now let's run the Federated Averaging algorithm. The execution of a Federated Learning algorithm from the perspective of TFF looks like this:\n",
+        "\n",
+        "1. Initialize the algorithm and get the initial server state. The server state contains the information necessary to perform the algorithm. Recall that, since TFF is functional, this state includes both any optimizer state the algorithm uses (e.g. momentum terms) and the model parameters themselves--these will be passed as arguments and returned as results from TFF computations.\n",
+        "2. Execute the algorithm round by round. In each round, a new server state will be returned as the result of each client training the model on its data. Typically in one round:\n",
+        "    1. The server broadcasts the model to all the participating clients.\n",
+        "    2. Each client performs work based on the model and its own data.\n",
+        "    3. The server aggregates the clients' model updates to produce a new server state, which contains a new model.\n",
+        "\n",
+        "For more details, please see [Custom Federated Algorithms, Part 2: Implementing Federated Averaging](https://www.tensorflow.org/federated/tutorials/custom_federated_algorithms_2) tutorial.\n",
+        "\n",
+        "Training metrics are written to the TensorBoard directory for display after training."
+      ]
+    },
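+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text"
+      },
+      "source": [
+        "To make the round structure described above concrete, the following is a minimal NumPy sketch of such a loop. It is an illustration only and is not how TFF implements Federated Averaging: the linear least-squares model, the `client_update` and `run_round` helpers, and the synthetic client data are all assumptions made for this example. It does not touch any TFF state, so it can be run or skipped independently of the rest of the tutorial."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code"
+      },
+      "outputs": [],
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "def client_update(server_weights, client_data, lr=0.02):\n",
+        "  \"\"\"Hypothetical client work: one gradient step on a least-squares loss.\"\"\"\n",
+        "  x, y = client_data\n",
+        "  grad = 2 * x.T @ (x @ server_weights - y) / len(y)\n",
+        "  new_weights = server_weights - lr * grad\n",
+        "  # Return the model delta and the number of examples used as its weight.\n",
+        "  return new_weights - server_weights, len(y)\n",
+        "\n",
+        "def run_round(server_weights, clients, server_lr=1.0):\n",
+        "  \"\"\"One round: broadcast, local client work, weighted aggregation.\"\"\"\n",
+        "  # 1. Broadcast: every client receives the same server weights.\n",
+        "  # 2. Each client computes an update from the model and its own data.\n",
+        "  deltas, sizes = zip(*(client_update(server_weights, c) for c in clients))\n",
+        "  # 3. The server averages the updates, weighted by client dataset size.\n",
+        "  avg_delta = np.average(np.stack(deltas), axis=0, weights=sizes)\n",
+        "  return server_weights + server_lr * avg_delta\n",
+        "\n",
+        "rng = np.random.RandomState(0)\n",
+        "# Three synthetic clients with 5, 10 and 20 examples of 3 features each.\n",
+        "toy_clients = [(rng.normal(size=(n, 3)), rng.normal(size=n)) for n in (5, 10, 20)]\n",
+        "toy_weights = np.zeros(3)  # Stands in for the initial server state.\n",
+        "for toy_round in range(3):\n",
+        "  toy_weights = run_round(toy_weights, toy_clients)\n",
+        "  print('toy round', toy_round, 'weights', toy_weights)"
+      ]
+    },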
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "cellView": "form",
+        "colab": {},
+        "colab_type": "code",
+        "id": "t5n9fXsGOO6-"
+      },
+      "outputs": [],
+      "source": [
+        "#@title Load utility functions\n",
+        "\n",
+        "def format_size(size):\n",
+        "  \"\"\"A helper function for creating a human-readable size.\"\"\"\n",
+        "  size = float(size)\n",
+        "  for unit in ['B','KiB','MiB','GiB']:\n",
+        "    if size \u003c 1024.0:\n",
+        "      return \"{size:3.2f}{unit}\".format(size=size, unit=unit)\n",
+        "    size /= 1024.0\n",
+        "  return \"{size:.2f}{unit}\".format(size=size, unit='TiB')\n",
+        "\n",
+        "def set_sizing_environment():\n",
+        "  \"\"\"Creates an environment that contains sizing information.\"\"\"\n",
+        "  # Creates a sizing executor factory to output communication cost\n",
+        "  # after the training finishes. Note that sizing executor only provides an\n",
+        "  # estimate (not exact) of communication cost, and doesn't capture cases like\n",
+        "  # compression of over-the-wire representations. However, it's perfect for\n",
+        "  # demonstrating the effect of compression in this tutorial.\n",
+        "  sizing_factory = tff.framework.sizing_executor_factory()\n",
+        "\n",
+        "  # TFF has a modular runtime you can configure yourself for various\n",
+        "  # environments and purposes, and this example just shows how to configure one\n",
+        "  # part of it to report the size of things.\n",
+        "  context = tff.framework.ExecutionContext(executor_fn=sizing_factory)\n",
+        "  tff.framework.set_default_context(context)\n",
+        "\n",
+        "  return sizing_factory"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "jvH6qIgynI8S"
+      },
+      "outputs": [],
+      "source": [
+        "def train(federated_averaging_process, num_rounds, num_clients_per_round, summary_writer):\n",
+        "  \"\"\"Trains the federated averaging process and outputs metrics.\"\"\"\n",
+        "  # Create an environment to get communication cost.\n",
+        "  environment = set_sizing_environment()\n",
+        "\n",
+        "  # Initialize the Federated Averaging algorithm to get the initial server state.\n",
+        "  state = federated_averaging_process.initialize()\n",
+        "\n",
+        "  with summary_writer.as_default():\n",
+        "    for round_num in range(num_rounds):\n",
+        "      # Sample the clients participating in this round.\n",
+        "      sampled_clients = np.random.choice(\n",
+        "          emnist_train.client_ids,\n",
+        "          size=num_clients_per_round,\n",
+        "          replace=False)\n",
+        "      # Create a list of `tf.data.Dataset` instances from the data of sampled clients.\n",
+        "      sampled_train_data = [\n",
+        "          emnist_train.create_tf_dataset_for_client(client)\n",
+        "          for client in sampled_clients\n",
+        "      ]\n",
+        "      # Run one round of the algorithm based on the server state and client data\n",
+        "      # and output the new state and metrics.\n",
+        "      state, metrics = federated_averaging_process.next(state, sampled_train_data)\n",
+        "\n",
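+        "      # For more about size_info, please see https://www.tensorflow.org/federated/api_docs/python/tff/framework/SizeInfo\n",
+        "      # Note: these are cumulative bit counts; format_size() only scales them by\n",
+        "      # powers of 1024, so its byte-style unit labels (e.g. MiB) denote bits here.\n",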
+        "      size_info = environment.get_size_info()\n",
+        "      broadcasted_bits = size_info.broadcast_bits[-1]\n",
+        "      aggregated_bits = size_info.aggregate_bits[-1]\n",
+        "\n",
+        "      print('round {:2d}, metrics={}, broadcasted_bits={}, aggregated_bits={}'.format(round_num, metrics, format_size(broadcasted_bits), format_size(aggregated_bits)))\n",
+        "\n",
+        "      # Add metrics to TensorBoard.\n",
+        "      for name, value in metrics['train']._asdict().items():\n",
+        "        tf.summary.scalar(name, value, step=round_num)\n",
+        "\n",
+        "      # Add broadcasted and aggregated data size to TensorBoard.\n",
+        "      tf.summary.scalar('cumulative_broadcasted_bits', broadcasted_bits, step=round_num)\n",
+        "      tf.summary.scalar('cumulative_aggregated_bits', aggregated_bits, step=round_num)\n",
+        "      summary_writer.flush()"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "height": 210
+        },
+        "colab_type": "code",
+        "executionInfo": {
+          "elapsed": 111784,
+          "status": "ok",
+          "timestamp": 1595018403902,
+          "user": {
+            "displayName": "",
+            "photoUrl": "",
+            "userId": ""
+          },
+          "user_tz": 420
+        },
+        "id": "xp3o3QcBlqY_",
+        "outputId": "65bd7832-0db8-4c6f-a6bd-e44286635c04"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "round  0, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.09433962404727936,loss=2.3181073665618896\u003e\u003e, broadcasted_bits=507.62MiB, aggregated_bits=507.62MiB\n",
+            "round  1, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.0765027329325676,loss=2.3148586750030518\u003e\u003e, broadcasted_bits=1015.24MiB, aggregated_bits=1015.24MiB\n",
+            "round  2, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.08872458338737488,loss=2.3089394569396973\u003e\u003e, broadcasted_bits=1.49GiB, aggregated_bits=1.49GiB\n",
+            "round  3, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.10852713137865067,loss=2.304060220718384\u003e\u003e, broadcasted_bits=1.98GiB, aggregated_bits=1.98GiB\n",
+            "round  4, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.10818713158369064,loss=2.3026843070983887\u003e\u003e, broadcasted_bits=2.48GiB, aggregated_bits=2.48GiB\n",
+            "round  5, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.10454985499382019,loss=2.300365447998047\u003e\u003e, broadcasted_bits=2.97GiB, aggregated_bits=2.97GiB\n",
+            "round  6, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.12841254472732544,loss=2.29765248298645\u003e\u003e, broadcasted_bits=3.47GiB, aggregated_bits=3.47GiB\n",
+            "round  7, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.14023210108280182,loss=2.2977216243743896\u003e\u003e, broadcasted_bits=3.97GiB, aggregated_bits=3.97GiB\n",
+            "round  8, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.15060241520404816,loss=2.29490327835083\u003e\u003e, broadcasted_bits=4.46GiB, aggregated_bits=4.46GiB\n",
+            "round  9, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.13088512420654297,loss=2.2942349910736084\u003e\u003e, broadcasted_bits=4.96GiB, aggregated_bits=4.96GiB\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Clean the log directory to avoid conflicts.\n",
+        "!rm -rf /tmp/logs/scalars/*\n",
+        "\n",
+        "# Set up the log directory and writer for TensorBoard.\n",
+        "logdir = \"/tmp/logs/scalars/original/\"\n",
+        "summary_writer = tf.summary.create_file_writer(logdir)\n",
+        "\n",
+        "train(federated_averaging_process=federated_averaging, num_rounds=10,\n",
+        "      num_clients_per_round=10, summary_writer=summary_writer)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "zwdpTySt7pGQ"
+      },
+      "source": [
+        "Start TensorBoard with the root log directory specified above to display the training metrics. It can take a few seconds for the data to load. In addition to loss and accuracy, we also output the amount of broadcasted and aggregated data. Broadcasted data refers to the tensors the server pushes to each client, while aggregated data refers to the tensors each client returns to the server."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "EJ9XQiL-7e1i"
+      },
+      "outputs": [],
+      "source": [
+        "%tensorboard --logdir /tmp/logs/scalars/ --port=0"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "rY5tWN_5ht6-"
+      },
+      "source": [
+        "## Build a custom broadcast and aggregate function\n",
+        "\n",
+        "Now let's implement functions that apply lossy compression algorithms to broadcasted data and aggregated data using the [tensor_encoding](http://jakubkonecny.com/files/tensor_encoding.pdf) API.\n",
+        "\n",
+        "First, we define two functions:\n",
+        "* `broadcast_encoder_fn` which creates an instance of [te.core.SimpleEncoder](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/simple_encoder.py#L30) to encode tensors or variables in server to client communication (Broadcast data).\n",
+        "* `mean_encoder_fn` which creates an instance of [te.core.GatherEncoder](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/gather_encoder.py#L30) to encode tensors or variables in client to server communication (Aggregation data).\n",
+        "\n",
+        "It is important to note that we do not apply a compression method to the entire model at once. Instead, we decide how (and whether) to compress each variable of the model independently. The reason is that small variables such as biases are generally more sensitive to inaccuracy, and because they are small, the potential communication savings are also small. Hence we do not compress small variables by default. In this example, we apply uniform quantization to 8 bits (256 buckets) to every variable with more than 10000 elements, and apply the identity (no compression) to all other variables."
+      ]
+    },
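+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text"
+      },
+      "source": [
+        "As a rough intuition for what uniform quantization to 8 bits does to a single large variable, here is a small NumPy sketch. It is a simplified stand-in written for this explanation and not the `tensor_encoding` implementation, which may differ in details such as rounding and how values are packed; the helper names `uniform_quantize` and `dequantize` are hypothetical. Each value is mapped to one of 256 evenly spaced buckets between the tensor's minimum and maximum, so with round-to-nearest the reconstruction error should be at most about half a bucket width."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code"
+      },
+      "outputs": [],
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "def uniform_quantize(x, bits=8):\n",
+        "  \"\"\"Maps floats to integer bucket indices in [0, 2**bits - 1] (bits up to 8).\"\"\"\n",
+        "  levels = 2**bits - 1\n",
+        "  lo, hi = float(x.min()), float(x.max())\n",
+        "  # Width of one bucket; fall back to 1.0 for a constant tensor.\n",
+        "  scale = (hi - lo) / levels if hi \u003e lo else 1.0\n",
+        "  q = np.round((x - lo) / scale).astype(np.uint8)\n",
+        "  return q, lo, scale\n",
+        "\n",
+        "def dequantize(q, lo, scale):\n",
+        "  \"\"\"Maps bucket indices back to approximate float values.\"\"\"\n",
+        "  return q.astype(np.float32) * scale + lo\n",
+        "\n",
+        "# A synthetic float32 tensor with more than 10000 elements.\n",
+        "toy_tensor = np.random.RandomState(0).normal(size=20000).astype(np.float32)\n",
+        "q, lo, scale = uniform_quantize(toy_tensor)\n",
+        "restored = dequantize(q, lo, scale)\n",
+        "\n",
+        "print('bytes before:', toy_tensor.nbytes, 'bytes after:', q.nbytes)\n",
+        "print('max abs error:', np.abs(toy_tensor - restored).max())\n",
+        "print('half bucket width:', scale / 2)"
+      ]
+    },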
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "lkRHkZTTnKn2"
+      },
+      "outputs": [],
+      "source": [
+        "def broadcast_encoder_fn(value):\n",
+        "  \"\"\"Function for building encoded broadcast.\"\"\"\n",
+        "  spec = tf.TensorSpec(value.shape, value.dtype)\n",
+        "  if value.shape.num_elements() \u003e 10000:\n",
+        "    return te.encoders.as_simple_encoder(\n",
+        "        te.encoders.uniform_quantization(bits=8), spec)\n",
+        "  else:\n",
+        "    return te.encoders.as_simple_encoder(te.encoders.identity(), spec)\n",
+        "\n",
+        "\n",
+        "def mean_encoder_fn(value):\n",
+        "  \"\"\"Function for building encoded mean.\"\"\"\n",
+        "  spec = tf.TensorSpec(value.shape, value.dtype)\n",
+        "  if value.shape.num_elements() \u003e 10000:\n",
+        "    return te.encoders.as_gather_encoder(\n",
+        "        te.encoders.uniform_quantization(bits=8), spec)\n",
+        "  else:\n",
+        "    return te.encoders.as_gather_encoder(te.encoders.identity(), spec)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "82iYUklQKP2e"
+      },
+      "source": [
+        "TFF provides APIs to convert the encoder functions into a format that the `tff.learning.build_federated_averaging_process` API can consume. By using `tff.learning.framework.build_encoded_broadcast_process_from_model` and `tff.learning.framework.build_encoded_mean_process_from_model`, we can create two processes that can be passed into the `broadcast_process` and `aggregation_process` arguments of `tff.learning.build_federated_averaging_process` to create a Federated Averaging algorithm with lossy compression."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "aqD61hqAGZiW"
+      },
+      "outputs": [],
+      "source": [
+        "encoded_broadcast_process = (\n",
+        "    tff.learning.framework.build_encoded_broadcast_process_from_model(\n",
+        "        tff_model_fn, broadcast_encoder_fn))\n",
+        "encoded_mean_process = (\n",
+        "    tff.learning.framework.build_encoded_mean_process_from_model(\n",
+        "        tff_model_fn, mean_encoder_fn))\n",
+        "\n",
+        "federated_averaging_with_compression = tff.learning.build_federated_averaging_process(\n",
+        "    tff_model_fn,\n",
+        "    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),\n",
+        "    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0),\n",
+        "    broadcast_process=encoded_broadcast_process,\n",
+        "    aggregation_process=encoded_mean_process)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "v3-ADI0hjTqH"
+      },
+      "source": [
+        "## Training the model again\n",
+        "\n",
+        "Now let's run the new Federated Averaging algorithm."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "height": 210
+        },
+        "colab_type": "code",
+        "executionInfo": {
+          "elapsed": 112358,
+          "status": "ok",
+          "timestamp": 1595018799560,
+          "user": {
+            "displayName": "",
+            "photoUrl": "",
+            "userId": ""
+          },
+          "user_tz": 420
+        },
+        "id": "0KM_THYdn1yH",
+        "outputId": "d4dd3fb3-1aba-47a7-c9e6-ee1e55867b9d"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "round  0, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.08722109347581863,loss=2.3216357231140137\u003e\u003e, broadcasted_bits=146.46MiB, aggregated_bits=146.46MiB\n",
+            "round  1, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.08379272371530533,loss=2.3108291625976562\u003e\u003e, broadcasted_bits=292.92MiB, aggregated_bits=292.92MiB\n",
+            "round  2, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.08834951370954514,loss=2.3074147701263428\u003e\u003e, broadcasted_bits=439.38MiB, aggregated_bits=439.39MiB\n",
+            "round  3, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.10467479377985,loss=2.305814027786255\u003e\u003e, broadcasted_bits=585.84MiB, aggregated_bits=585.85MiB\n",
+            "round  4, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.09853658825159073,loss=2.3012874126434326\u003e\u003e, broadcasted_bits=732.30MiB, aggregated_bits=732.31MiB\n",
+            "round  5, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.14904330670833588,loss=2.3005223274230957\u003e\u003e, broadcasted_bits=878.77MiB, aggregated_bits=878.77MiB\n",
+            "round  6, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.13152804970741272,loss=2.2985599040985107\u003e\u003e, broadcasted_bits=1.00GiB, aggregated_bits=1.00GiB\n",
+            "round  7, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.12392637878656387,loss=2.297102451324463\u003e\u003e, broadcasted_bits=1.14GiB, aggregated_bits=1.14GiB\n",
+            "round  8, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.13289350271224976,loss=2.2944107055664062\u003e\u003e, broadcasted_bits=1.29GiB, aggregated_bits=1.29GiB\n",
+            "round  9, metrics=\u003cbroadcast=\u003c\u003e,aggregation=\u003c\u003e,train=\u003csparse_categorical_accuracy=0.12661737203598022,loss=2.2971296310424805\u003e\u003e, broadcasted_bits=1.43GiB, aggregated_bits=1.43GiB\n"
+          ]
+        }
+      ],
+      "source": [
+        "logdir_for_compression = \"/tmp/logs/scalars/compression/\"\n",
+        "summary_writer_for_compression = tf.summary.create_file_writer(\n",
+        "    logdir_for_compression)\n",
+        "\n",
+        "train(federated_averaging_process=federated_averaging_with_compression, \n",
+        "      num_rounds=10,\n",
+        "      num_clients_per_round=10,\n",
+        "      summary_writer=summary_writer_for_compression)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "sE8Bnjel8TIA"
+      },
+      "source": [
+        "Start TensorBoard again to compare the training metrics between two runs.\n",
+        "\n",
+        "As you can see in TensorBoard, there is a significant reduction between the `original` and `compression` curves in the `cumulative_broadcasted_bits` and `cumulative_aggregated_bits` plots, while in the `loss` and `sparse_categorical_accuracy` plots the two curves are quite similar.\n",
+        "\n",
+        "In conclusion, we implemented a compression algorithm that achieves performance similar to the original Federated Averaging algorithm while significantly reducing the communication cost (roughly 3.5x less data after 10 rounds in the runs above)."
+      ]
+    },
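+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text"
+      },
+      "source": [
+        "As a sanity check on the sizes printed by the two training runs, the arithmetic below estimates the per-round communication from the variable shapes of the CNN defined earlier. It assumes float32 variables, 10 clients per round, and that the sizing executor counts 32 bits per uncompressed element per client; the shape list is written out by hand for this example. The first number should match the 507.62 reported in round 0 of the uncompressed run. The second is an idealized estimate that charges exactly 8 bits per element for variables above the threshold and ignores any overhead of the encoded representation, so it comes out lower than the 146.46 logged for the compressed run."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code"
+      },
+      "outputs": [],
+      "source": [
+        "# Shapes of the trainable variables of the CNN defined earlier (digits only).\n",
+        "variable_shapes = [\n",
+        "    (5, 5, 1, 32), (32,),       # first convolution: kernel, bias\n",
+        "    (5, 5, 32, 64), (64,),      # second convolution: kernel, bias\n",
+        "    (7 * 7 * 64, 512), (512,),  # dense layer: kernel, bias\n",
+        "    (512, 10), (10,),           # output layer: kernel, bias\n",
+        "]\n",
+        "NUM_CLIENTS = 10            # Clients per round, as in the runs above.\n",
+        "COMPRESSION_THRESHOLD = 10000  # Same threshold as the encoder functions.\n",
+        "\n",
+        "def num_elements(shape):\n",
+        "  \"\"\"Returns the number of scalar elements in a tensor of this shape.\"\"\"\n",
+        "  n = 1\n",
+        "  for dim in shape:\n",
+        "    n *= dim\n",
+        "  return n\n",
+        "\n",
+        "element_counts = [num_elements(s) for s in variable_shapes]\n",
+        "uncompressed_bits = NUM_CLIENTS * sum(32 * n for n in element_counts)\n",
+        "# Idealized: 8 bits per element of large variables, 32 bits otherwise.\n",
+        "idealized_bits = NUM_CLIENTS * sum(\n",
+        "    (8 if n \u003e COMPRESSION_THRESHOLD else 32) * n for n in element_counts)\n",
+        "\n",
+        "print('uncompressed per round: {:.2f} Mi bits'.format(uncompressed_bits / 2**20))\n",
+        "print('idealized compressed per round: {:.2f} Mi bits'.format(idealized_bits / 2**20))"
+      ]
+    },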
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {},
+        "colab_type": "code",
+        "id": "K9M2_1re28ff"
+      },
+      "outputs": [],
+      "source": [
+        "%tensorboard --logdir /tmp/logs/scalars/ --port=0"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "colab_type": "text",
+        "id": "Jaz9_9H7NUMW"
+      },
+      "source": [
+        "## Exercises\n",
+        "\n",
+        "To implement a custom compression algorithm and apply it to the training loop,\n",
+        "you can:\n",
+        "\n",
+        "1.  Implement a new compression algorithm as a subclass of\n",
+        "    [`EncodingStageInterface`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/encoding_stage.py#L75)\n",
+        "    or its more general variant,\n",
+        "    [`AdaptiveEncodingStageInterface`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/encoding_stage.py#L274)\n",
+        "    following\n",
+        "    [this example](https://github.com/tensorflow/federated/blob/master/tensorflow_federated/python/research/compression/sparsity.py).\n",
+        "1.  Construct your new\n",
+        "    [`Encoder`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/core_encoder.py#L38)\n",
+        "    and specialize it for\n",
+        "    [model broadcast](https://github.com/tensorflow/federated/blob/master/tensorflow_federated/python/research/compression/run_experiment.py#L95)\n",
+        "    or\n",
+        "    [model update averaging](https://github.com/tensorflow/federated/blob/e67590f284b487c6b889c070a96c35b8e0341e3b/tensorflow_federated/python/research/compression/run_experiment.py#L95).\n",
+        "1.  Use those objects to build the entire\n",
+        "    [training computation](https://github.com/tensorflow/federated/blob/e67590f284b487c6b889c070a96c35b8e0341e3b/tensorflow_federated/python/research/compression/run_experiment.py#L204).\n",
+        "\n",
+        "Potentially valuable open research questions include: non-uniform quantization, lossless compression such as Huffman coding, and mechanisms for adapting compression based on information from previous training rounds.\n",
+        "\n",
+        "Recommended reading materials:\n",
+        "* [Expanding the Reach of Federated Learning by Reducing Client Resource Requirements](https://research.google/pubs/pub47774/)\n",
+        "* [Federated Learning: Strategies for Improving Communication Efficiency](https://research.google/pubs/pub45648/)\n",
+        "* _Section 3.5 Communication and Compression_ in [Advanced and Open Problems in Federated Learning](https://arxiv.org/abs/1912.04977)"
+      ]
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "collapsed_sections": [],
+      "last_runtime": {
+        "build_target": "",
+        "kind": "local"
+      },
+      "name": "TFF for Federated Learning Research: Model and Update Compression",
+      "provenance": [],
+      "toc_visible": true
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
+%% Cell type:markdown id: tags:
+
+##### Copyright 2020 The TensorFlow Authors.
+
+%% Cell type:code id: tags:
+
+``` 
+#@title Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+```
+
+%% Cell type:markdown id: tags:
+
+# TFF for Federated Learning Research: Model and Update Compression
+
+**NOTE**: This colab has been verified to work with the [latest released version](https://github.com/tensorflow/federated#compatibility) of the `tensorflow_federated` pip package, but the TensorFlow Federated project is still in pre-release development and may not work on `master`.
+
+In this tutorial, we use the [EMNIST](https://www.tensorflow.org/federated/api_docs/python/tff/simulation/datasets/emnist) dataset to demonstrate how to enable lossy compression algorithms to reduce communication cost in the Federated Averaging algorithm using the `tff.learning.build_federated_averaging_process` API and the [tensor_encoding](http://jakubkonecny.com/files/tensor_encoding.pdf) API. For more details on the Federated Averaging algorithm, see the paper [Communication-Efficient Learning of Deep Networks from Decentralized Data](https://arxiv.org/abs/1602.05629).
+
+%% Cell type:markdown id: tags:
+
+## Before we start
+
+Before we start, please run the following to make sure that your environment is
+set up correctly. If you don't see a greeting, please refer to the
+[Installation](../install.md) guide for instructions.
+
+%% Cell type:code id: tags:
+
+``` 
+#@test {"skip": true}
+!pip install --quiet --upgrade tensorflow_federated
+!pip install --quiet --upgrade tensorflow-model-optimization
+
+%load_ext tensorboard
+```
+
+%% Cell type:code id: tags:
+
+``` 
+import functools
+
+import numpy as np
+import tensorflow as tf
+import tensorflow_federated as tff
+
+from tensorflow_model_optimization.python.core.internal import tensor_encoding as te
+```
+
+%% Cell type:markdown id: tags:
+
+Verify that TFF is working.
+
+%% Cell type:code id: tags:
+
+``` 
+@tff.federated_computation
+def hello_world():
+  return 'Hello, World!'
+
+hello_world()
+```
+
+%% Output
+
+    b'Hello, World!'
+
+%% Cell type:markdown id: tags:
+
+## Preparing the input data
+In this section we load and preprocess the EMNIST dataset included in TFF. Please check out the [Federated Learning for Image Classification](https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification#preparing_the_input_data) tutorial for more details about the EMNIST dataset.
+
+
+%% Cell type:code id: tags:
+
+``` 
+# This value applies only to the EMNIST dataset; choose an appropriate
+# value when switching to another dataset.
+MAX_CLIENT_DATASET_SIZE = 418
+
+CLIENT_EPOCHS_PER_ROUND = 1
+CLIENT_BATCH_SIZE = 20
+TEST_BATCH_SIZE = 500
+
+emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(
+    only_digits=True)
+
+def reshape_emnist_element(element):
+  return (tf.expand_dims(element['pixels'], axis=-1), element['label'])
+
+def preprocess_train_dataset(dataset):
+  """Preprocessing function for the EMNIST training dataset."""
+  return (dataset
+          # Shuffle according to the largest client dataset
+          .shuffle(buffer_size=MAX_CLIENT_DATASET_SIZE)
+          # Repeat to do multiple local epochs
+          .repeat(CLIENT_EPOCHS_PER_ROUND)
+          # Batch to a fixed client batch size
+          .batch(CLIENT_BATCH_SIZE, drop_remainder=False)
+          # Preprocessing step
+          .map(reshape_emnist_element))
+
+emnist_train = emnist_train.preprocess(preprocess_train_dataset)
+```
+
+%% Cell type:markdown id: tags:
+
+## Defining a model
+
+Here we define a Keras model based on the original FedAvg CNN, and then wrap the Keras model in an instance of [tff.learning.Model](https://www.tensorflow.org/federated/api_docs/python/tff/learning/Model) so that it can be consumed by TFF.
+
+Note that we need a **function** that produces a model rather than a model instance. In addition, the function **cannot** just capture a pre-constructed model; it must create the model in the context in which it is called. The reason is that TFF is designed to go to devices, and needs control over when resources are constructed so that they can be captured and packaged up.
+
+%% Cell type:code id: tags:
+
+``` 
+def create_original_fedavg_cnn_model(only_digits=True):
+  """The CNN model used in https://arxiv.org/abs/1602.05629."""
+  data_format = 'channels_last'
+
+  max_pool = functools.partial(
+      tf.keras.layers.MaxPooling2D,
+      pool_size=(2, 2),
+      padding='same',
+      data_format=data_format)
+  conv2d = functools.partial(
+      tf.keras.layers.Conv2D,
+      kernel_size=5,
+      padding='same',
+      data_format=data_format,
+      activation=tf.nn.relu)
+
+  model = tf.keras.models.Sequential([
+      tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),
+      conv2d(filters=32),
+      max_pool(),
+      conv2d(filters=64),
+      max_pool(),
+      tf.keras.layers.Flatten(),
+      tf.keras.layers.Dense(512, activation=tf.nn.relu),
+      tf.keras.layers.Dense(10 if only_digits else 62),
+      tf.keras.layers.Softmax(),
+  ])
+
+  return model
+
+# Gets the type information of the input data. TFF is a strongly typed
+# functional programming framework, and needs type information about inputs to
+# the model.
+input_spec = emnist_train.create_tf_dataset_for_client(
+    emnist_train.client_ids[0]).element_spec
+
+def tff_model_fn():
+  keras_model = create_original_fedavg_cnn_model()
+  return tff.learning.from_keras_model(
+      keras_model=keras_model,
+      input_spec=input_spec,
+      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
+      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
+```
+
+%% Cell type:markdown id: tags:
+
+## Training the model and outputting training metrics
+
+Now we are ready to construct a Federated Averaging algorithm and train the defined model on the EMNIST dataset.
+
+First we need to build a Federated Averaging algorithm using the [tff.learning.build_federated_averaging_process](https://www.tensorflow.org/federated/api_docs/python/tff/learning/build_federated_averaging_process) API.
+
+%% Cell type:code id: tags:
+
+``` 
+federated_averaging = tff.learning.build_federated_averaging_process(
+    model_fn=tff_model_fn,
+    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
+    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))
+```
+
+%% Cell type:markdown id: tags:
+
+Now let's run the Federated Averaging algorithm. The execution of a Federated Learning algorithm from the perspective of TFF looks like this:
+
+1. Initialize the algorithm and get the initial server state. The server state contains the information necessary to run the algorithm. Recall that, since TFF is functional, this state includes both any optimizer state the algorithm uses (e.g. momentum terms) and the model parameters themselves; these are passed as arguments to, and returned as results from, TFF computations.
+2. Execute the algorithm round by round. In each round, a new server state is returned as the result of each client training the model on its data. Typically, in one round:
+    1. The server broadcasts the model to all participating clients.
+    2. Each client performs work based on the model and its own data.
+    3. The server aggregates the client results to produce a server state that contains a new model.
+
+For more details, please see the [Custom Federated Algorithms, Part 2: Implementing Federated Averaging](https://www.tensorflow.org/federated/tutorials/custom_federated_algorithms_2) tutorial.
+
+Training metrics are written to the TensorBoard log directory so that they can be displayed after training.
+
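The three steps of a round can be sketched outside TFF with plain NumPy. This is a toy illustration under stated assumptions, not how TFF implements Federated Averaging: the model is a single weight vector, each client's "training" is one gradient step on a least-squares loss, and `client_update`, `run_round`, and the synthetic data are hypothetical names invented for this sketch.

```python
import numpy as np


def client_update(weights, features, labels, learning_rate=0.1):
  """One local gradient step on a least-squares loss; returns the weight delta."""
  predictions = features @ weights
  gradient = features.T @ (predictions - labels) / len(labels)
  new_weights = weights - learning_rate * gradient
  return new_weights - weights


def run_round(server_weights, client_datasets, server_learning_rate=1.0):
  """Broadcast, local work on each client, then example-weighted averaging."""
  deltas, num_examples = [], []
  for features, labels in client_datasets:
    # 1. Broadcast: every client starts from a copy of the server model.
    local_weights = server_weights.copy()
    # 2. Client work: compute a model update from local data only.
    deltas.append(client_update(local_weights, features, labels))
    num_examples.append(len(labels))
  # 3. Aggregate: average the updates, weighted by client dataset size.
  mean_delta = np.average(deltas, axis=0, weights=num_examples)
  return server_weights + server_learning_rate * mean_delta


# Synthetic clients of different sizes that share the same underlying model.
rng = np.random.default_rng(0)
true_weights = np.array([1.0, -2.0, 0.5])
client_datasets = []
for size in (20, 35, 50):
  features = rng.normal(size=(size, 3))
  client_datasets.append((features, features @ true_weights))

weights = np.zeros(3)
initial_error = np.linalg.norm(weights - true_weights)
for _ in range(200):
  weights = run_round(weights, client_datasets)
final_error = np.linalg.norm(weights - true_weights)
print(initial_error, final_error)
```

The functional style mirrors the TFF loop: `run_round` takes the current server state and the client data and returns a new state, without mutating anything in place.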
+%% Cell type:code id: tags:
+
+``` 
+#@title Load utility functions
+
+def format_size(size):
+  """A helper function for creating a human-readable size."""
+  size = float(size)
+  for unit in ['B','KiB','MiB','GiB']:
+    if size < 1024.0:
+      return "{size:3.2f}{unit}".format(size=size, unit=unit)
+    size /= 1024.0
+  return "{size:.2f}{unit}".format(size=size, unit='TiB')
+
+def set_sizing_environment():
+  """Creates an environment that contains sizing information."""
+  # Creates a sizing executor factory to output communication cost
+  # after the training finishes. Note that sizing executor only provides an
+  # estimate (not exact) of communication cost, and doesn't capture cases like
+  # compression of over-the-wire representations. However, it's perfect for
+  # demonstrating the effect of compression in this tutorial.
+  sizing_factory = tff.framework.sizing_executor_factory()
+
+  # TFF has a modular runtime you can configure yourself for various
+  # environments and purposes, and this example just shows how to configure one
+  # part of it to report the size of things.
+  context = tff.framework.ExecutionContext(executor_fn=sizing_factory)
+  tff.framework.set_default_context(context)
+
+  return sizing_factory
+```
+
+%% Cell type:code id: tags:
+
+``` 
+def train(federated_averaging_process, num_rounds, num_clients_per_round, summary_writer):
+  """Trains the federated averaging process and outputs metrics."""
+  # Create an environment to get communication cost.
+  environment = set_sizing_environment()
+
+  # Initialize the Federated Averaging algorithm to get the initial server state.
+  state = federated_averaging_process.initialize()
+
+  with summary_writer.as_default():
+    for round_num in range(num_rounds):
+      # Sample the clients participating in this round.
+      sampled_clients = np.random.choice(
+          emnist_train.client_ids,
+          size=num_clients_per_round,
+          replace=False)
+      # Create a list of `tf.data.Dataset` instances from the data of sampled clients.
+      sampled_train_data = [
+          emnist_train.create_tf_dataset_for_client(client)
+          for client in sampled_clients
+      ]
+      # Run one round of the algorithm based on the server state and client data
+      # and output the new state and metrics.
+      state, metrics = federated_averaging_process.next(state, sampled_train_data)
+
+      # For more about size_info, please see https://www.tensorflow.org/federated/api_docs/python/tff/framework/SizeInfo
+      size_info = environment.get_size_info()
+      broadcasted_bits = size_info.broadcast_bits[-1]
+      aggregated_bits = size_info.aggregate_bits[-1]
+
+      print('round {:2d}, metrics={}, broadcasted_bits={}, aggregated_bits={}'.format(round_num, metrics, format_size(broadcasted_bits), format_size(aggregated_bits)))
+
+      # Add metrics to Tensorboard.
+      for name, value in metrics['train']._asdict().items():
+        tf.summary.scalar(name, value, step=round_num)
+
+      # Add broadcasted and aggregated data size to Tensorboard.
+      tf.summary.scalar('cumulative_broadcasted_bits', broadcasted_bits, step=round_num)
+      tf.summary.scalar('cumulative_aggregated_bits', aggregated_bits, step=round_num)
+      summary_writer.flush()
+```
+
+%% Cell type:code id: tags:
+
+``` 
+# Clean the log directory to avoid conflicts.
+!rm -R /tmp/logs/scalars/*
+
+# Set up the log directory and writer for Tensorboard.
+logdir = "/tmp/logs/scalars/original/"
+summary_writer = tf.summary.create_file_writer(logdir)
+
+train(federated_averaging_process=federated_averaging, num_rounds=10,
+      num_clients_per_round=10, summary_writer=summary_writer)
+```
+
+%% Output
+
+    round  0, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.09433962404727936,loss=2.3181073665618896>>, broadcasted_bits=507.62MiB, aggregated_bits=507.62MiB
+    round  1, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.0765027329325676,loss=2.3148586750030518>>, broadcasted_bits=1015.24MiB, aggregated_bits=1015.24MiB
+    round  2, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.08872458338737488,loss=2.3089394569396973>>, broadcasted_bits=1.49GiB, aggregated_bits=1.49GiB
+    round  3, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.10852713137865067,loss=2.304060220718384>>, broadcasted_bits=1.98GiB, aggregated_bits=1.98GiB
+    round  4, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.10818713158369064,loss=2.3026843070983887>>, broadcasted_bits=2.48GiB, aggregated_bits=2.48GiB
+    round  5, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.10454985499382019,loss=2.300365447998047>>, broadcasted_bits=2.97GiB, aggregated_bits=2.97GiB
+    round  6, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.12841254472732544,loss=2.29765248298645>>, broadcasted_bits=3.47GiB, aggregated_bits=3.47GiB
+    round  7, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.14023210108280182,loss=2.2977216243743896>>, broadcasted_bits=3.97GiB, aggregated_bits=3.97GiB
+    round  8, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.15060241520404816,loss=2.29490327835083>>, broadcasted_bits=4.46GiB, aggregated_bits=4.46GiB
+    round  9, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.13088512420654297,loss=2.2942349910736084>>, broadcasted_bits=4.96GiB, aggregated_bits=4.96GiB
+
+%% Cell type:markdown id: tags:
+
+Start TensorBoard with the root log directory specified above to display the training metrics. It can take a few seconds for the data to load. In addition to loss and accuracy, we also output the amount of broadcasted and aggregated data. Broadcasted data refers to the tensors the server pushes to each client, while aggregated data refers to the tensors each client returns to the server.
+
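As a sanity check on the logged sizes, the first-round figure can be reproduced by hand from the CNN defined earlier. The sketch below assumes that every parameter is sent as a 32-bit float to each of the 10 sampled clients and that the logged values are bit counts (the field is called `broadcast_bits`) which `format_size` then divides by powers of 1024. The variable names in the dictionary are illustrative labels, and `format_size` is repeated only to keep the block self-contained.

```python
def format_size(size):
  """Same helper as in the tutorial, repeated to keep this block standalone."""
  size = float(size)
  for unit in ['B', 'KiB', 'MiB', 'GiB']:
    if size < 1024.0:
      return "{size:3.2f}{unit}".format(size=size, unit=unit)
    size /= 1024.0
  return "{size:.2f}{unit}".format(size=size, unit='TiB')


# Number of elements in each trainable variable of the CNN: 5x5 convolutions
# with 'same' padding, two 2x2 poolings (28 -> 14 -> 7), then dense layers.
variable_sizes = {
    'conv1/kernel': 5 * 5 * 1 * 32,
    'conv1/bias': 32,
    'conv2/kernel': 5 * 5 * 32 * 64,
    'conv2/bias': 64,
    'dense1/kernel': 7 * 7 * 64 * 512,
    'dense1/bias': 512,
    'dense2/kernel': 512 * 10,
    'dense2/bias': 10,
}

num_parameters = sum(variable_sizes.values())
bits_per_client = num_parameters * 32  # float32 parameters
bits_per_round = bits_per_client * 10  # 10 clients sampled per round

print(num_parameters)               # 1663370
print(format_size(bits_per_round))  # 507.62MiB
```

The result agrees with the round 0 line of the output above, and later rounds are multiples of it because the reported totals are cumulative. The agreement also suggests that the `MiB` and `GiB` labels in the log should be read as multiples of 2^20 and 2^30 bits rather than bytes.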
+%% Cell type:code id: tags:
+
+``` 
+%tensorboard --logdir /tmp/logs/scalars/ --port=0
+```
+
+%% Cell type:markdown id: tags:
+
+## Build a custom broadcast and aggregate function
+
+Now let's implement functions that apply lossy compression algorithms to the broadcasted and aggregated data using the [tensor_encoding](http://jakubkonecny.com/files/tensor_encoding.pdf) API.
+
+First, we define two functions:
+* `broadcast_encoder_fn`, which creates an instance of [te.core.SimpleEncoder](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/simple_encoder.py#L30) to encode tensors or variables in server-to-client communication (broadcast data).
+* `mean_encoder_fn`, which creates an instance of [te.core.GatherEncoder](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/gather_encoder.py#L30) to encode tensors or variables in client-to-server communication (aggregation data).
+
+It is important to note that we do not apply a compression method to the entire model at once. Instead, we decide how (and whether) to compress each variable of the model independently. The reason is that generally, small variables such as biases are more sensitive to inaccuracy, and being relatively small, the potential communication savings are also relatively small. Hence we do not compress small variables by default. In this example, we apply uniform quantization to 8 bits (256 buckets) to every variable with more than 10000 elements, and only apply identity to other variables.
+
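To make "uniform quantization to 8 bits (256 buckets)" concrete, here is a minimal NumPy sketch of the idea. It is not the `tensor_encoding` implementation, which may differ in details such as rounding and bit packing; `quantize_uniform` and `dequantize_uniform` are hypothetical helpers written for this illustration only.

```python
import numpy as np


def quantize_uniform(values, bits=8):
  """Maps floats to integer bucket indices in [0, 2**bits - 1]."""
  num_buckets = 2 ** bits
  min_value, max_value = float(values.min()), float(values.max())
  # Guard against a constant tensor, where the range would be zero.
  scale = (max_value - min_value) / (num_buckets - 1) or 1.0
  dtype = np.uint8 if bits <= 8 else np.uint32
  indices = np.round((values - min_value) / scale).astype(dtype)
  return indices, min_value, scale


def dequantize_uniform(indices, min_value, scale):
  """Reconstructs approximate floats from bucket indices."""
  return indices.astype(np.float32) * scale + min_value


# A stand-in for a large model variable (more than 10000 elements).
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.05, size=20000).astype(np.float32)

indices, min_value, scale = quantize_uniform(weights, bits=8)
restored = dequantize_uniform(indices, min_value, scale)

max_error = float(np.max(np.abs(restored - weights)))
compression = weights.nbytes / indices.nbytes
print(max_error <= scale / 2 + 1e-6, compression)
```

Each value is off by at most half a bucket width, while the payload shrinks from 32 to 8 bits per element. For a small variable such as a bias vector, the saving is negligible compared to the extra error, which is why the tutorial leaves those uncompressed.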
+%% Cell type:code id: tags:
+
+``` 
+def broadcast_encoder_fn(value):
+  """Function for building encoded broadcast."""
+  spec = tf.TensorSpec(value.shape, value.dtype)
+  if value.shape.num_elements() > 10000:
+    return te.encoders.as_simple_encoder(
+        te.encoders.uniform_quantization(bits=8), spec)
+  else:
+    return te.encoders.as_simple_encoder(te.encoders.identity(), spec)
+
+
+def mean_encoder_fn(value):
+  """Function for building encoded mean."""
+  spec = tf.TensorSpec(value.shape, value.dtype)
+  if value.shape.num_elements() > 10000:
+    return te.encoders.as_gather_encoder(
+        te.encoders.uniform_quantization(bits=8), spec)
+  else:
+    return te.encoders.as_gather_encoder(te.encoders.identity(), spec)
+```
+
+%% Cell type:markdown id: tags:
+
+TFF provides APIs to convert the encoder functions into a format that the `tff.learning.build_federated_averaging_process` API can consume. Using `tff.learning.framework.build_encoded_broadcast_process_from_model` and `tff.learning.framework.build_encoded_mean_process_from_model`, we create two processes that can be passed to the `broadcast_process` and `aggregation_process` arguments of `tff.learning.build_federated_averaging_process` to create a Federated Averaging algorithm with lossy compression.
+
+%% Cell type:code id: tags:
+
+``` 
+encoded_broadcast_process = (
+    tff.learning.framework.build_encoded_broadcast_process_from_model(
+        tff_model_fn, broadcast_encoder_fn))
+encoded_mean_process = (
+    tff.learning.framework.build_encoded_mean_process_from_model(
+        tff_model_fn, mean_encoder_fn))
+
+federated_averaging_with_compression = tff.learning.build_federated_averaging_process(
+    tff_model_fn,
+    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
+    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0),
+    broadcast_process=encoded_broadcast_process,
+    aggregation_process=encoded_mean_process)
+```
+
+%% Cell type:markdown id: tags:
+
+## Training the model again
+
+Now let's run the new Federated Averaging algorithm.
+
+%% Cell type:code id: tags:
+
+``` 
+logdir_for_compression = "/tmp/logs/scalars/compression/"
+summary_writer_for_compression = tf.summary.create_file_writer(
+    logdir_for_compression)
+
+train(federated_averaging_process=federated_averaging_with_compression,
+      num_rounds=10,
+      num_clients_per_round=10,
+      summary_writer=summary_writer_for_compression)
+```
+
+%% Output
+
+    round  0, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.08722109347581863,loss=2.3216357231140137>>, broadcasted_bits=146.46MiB, aggregated_bits=146.46MiB
+    round  1, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.08379272371530533,loss=2.3108291625976562>>, broadcasted_bits=292.92MiB, aggregated_bits=292.92MiB
+    round  2, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.08834951370954514,loss=2.3074147701263428>>, broadcasted_bits=439.38MiB, aggregated_bits=439.39MiB
+    round  3, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.10467479377985,loss=2.305814027786255>>, broadcasted_bits=585.84MiB, aggregated_bits=585.85MiB
+    round  4, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.09853658825159073,loss=2.3012874126434326>>, broadcasted_bits=732.30MiB, aggregated_bits=732.31MiB
+    round  5, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.14904330670833588,loss=2.3005223274230957>>, broadcasted_bits=878.77MiB, aggregated_bits=878.77MiB
+    round  6, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.13152804970741272,loss=2.2985599040985107>>, broadcasted_bits=1.00GiB, aggregated_bits=1.00GiB
+    round  7, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.12392637878656387,loss=2.297102451324463>>, broadcasted_bits=1.14GiB, aggregated_bits=1.14GiB
+    round  8, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.13289350271224976,loss=2.2944107055664062>>, broadcasted_bits=1.29GiB, aggregated_bits=1.29GiB
+    round  9, metrics=<broadcast=<>,aggregation=<>,train=<sparse_categorical_accuracy=0.12661737203598022,loss=2.2971296310424805>>, broadcasted_bits=1.43GiB, aggregated_bits=1.43GiB
+
+%% Cell type:markdown id: tags:
+
+Start TensorBoard again to compare the training metrics between the two runs.
+
+As you can see in TensorBoard, the `compression` curves are significantly lower than the `original` curves in the `cumulative_broadcasted_bits` and `cumulative_aggregated_bits` plots, while the two curves are very similar in the `loss` and `sparse_categorical_accuracy` plots.
+
+In conclusion, we implemented a compression algorithm that achieves performance similar to the original Federated Averaging algorithm while significantly reducing the communication cost.
+
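The size of the reduction can be read off the two training logs. The short calculation below compares the round 0 values and sets them against an idealized estimate in which every variable with more than 10000 elements costs exactly 8 bits per element. The idealized figure is an assumption of this sketch that ignores any encoding overhead, not a number reported by TFF.

```python
# Round-0 sizes copied from the two training logs above (same unit in both).
original_size = 507.62
compressed_size = 146.46
measured_ratio = original_size / compressed_size

# Idealized estimate: variables with more than 10000 elements cost 8 bits per
# element, all others stay at 32 bits. In this CNN only the second convolution
# kernel and the first dense kernel exceed the threshold.
large_elements = 5 * 5 * 32 * 64 + 7 * 7 * 64 * 512
small_elements = (5 * 5 * 1 * 32 + 32) + 64 + 512 + (512 * 10 + 10)
original_bits = (large_elements + small_elements) * 32
ideal_bits = large_elements * 8 + small_elements * 32
ideal_ratio = original_bits / ideal_bits

print(round(measured_ratio, 2), round(ideal_ratio, 2))
```

The measured reduction (roughly 3.5x) is somewhat below the idealized one (just under 4x). A gap in this direction is expected, since an encoded tensor carries some overhead beyond the raw 8 bits per element; the exact source of the overhead is not derived here.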
+%% Cell type:code id: tags:
+
+``` 
+%tensorboard --logdir /tmp/logs/scalars/ --port=0
+```
+
+%% Cell type:markdown id: tags:
+
+## Exercises
+
+To implement a custom compression algorithm and apply it to the training loop,
+you can:
+
+1.  Implement a new compression algorithm as a subclass of
+    [`EncodingStageInterface`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/encoding_stage.py#L75)
+    or its more general variant,
+    [`AdaptiveEncodingStageInterface`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/encoding_stage.py#L274)
+    following
+    [this example](https://github.com/tensorflow/federated/blob/master/tensorflow_federated/python/research/compression/sparsity.py).
+1.  Construct your new
+    [`Encoder`](https://github.com/tensorflow/model-optimization/blob/ee53c9a9ae2e18ac1e443842b0b96229f0afb6d6/tensorflow_model_optimization/python/core/internal/tensor_encoding/core/core_encoder.py#L38)
+    and specialize it for
+    [model broadcast](https://github.com/tensorflow/federated/blob/master/tensorflow_federated/python/research/compression/run_experiment.py#L95)
+    or
+    [model update averaging](https://github.com/tensorflow/federated/blob/e67590f284b487c6b889c070a96c35b8e0341e3b/tensorflow_federated/python/research/compression/run_experiment.py#L95).
+1.  Use those objects to build the entire
+    [training computation](https://github.com/tensorflow/federated/blob/e67590f284b487c6b889c070a96c35b8e0341e3b/tensorflow_federated/python/research/compression/run_experiment.py#L204).
+
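Before writing a full encoding stage, it can help to prototype the encode and decode logic on plain arrays. The following sketch shows one possible lossy scheme, top-k sparsification, in NumPy. It does not implement `EncodingStageInterface`, it is not necessarily the method used in the linked example, and `encode_top_k` and `decode_top_k` are hypothetical names chosen for this illustration.

```python
import numpy as np


def encode_top_k(values, k):
  """Keeps the k largest-magnitude entries; returns indices, values and shape."""
  flat = values.ravel()
  # argpartition places the k largest magnitudes in the last k positions.
  indices = np.sort(np.argpartition(np.abs(flat), -k)[-k:])
  return indices, flat[indices], values.shape


def decode_top_k(indices, kept_values, shape):
  """Rebuilds a dense tensor, filling the dropped entries with zeros."""
  flat = np.zeros(int(np.prod(shape)), dtype=kept_values.dtype)
  flat[indices] = kept_values
  return flat.reshape(shape)


update = np.array([[0.01, -0.80, 0.05],
                   [0.60, -0.02, 0.03]], dtype=np.float32)
indices, kept_values, shape = encode_top_k(update, k=2)
restored = decode_top_k(indices, kept_values, shape)
print(indices, kept_values)
print(restored)
```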
+Potentially valuable open research questions include: non-uniform quantization, lossless compression such as Huffman coding, and mechanisms for adapting compression based on information from previous training rounds.
+
+Recommended reading materials:
+* [Expanding the Reach of Federated Learning by Reducing Client Resource Requirements](https://research.google/pubs/pub47774/)
+* [Federated Learning: Strategies for Improving Communication Efficiency](https://research.google/pubs/pub45648/)
+* _Section 3.5 Communication and Compression_ in [Advanced and Open Problems in Federated Learning](https://arxiv.org/abs/1912.04977)