Commit 1de91fc8 authored by Michael Reneer, committed by Zachary Garrett

Add documentation and scripts to help run simulations on GCP.

PiperOrigin-RevId: 272009268
parent a1cd25cb
535 additions and 20 deletions
......@@ -27,7 +27,7 @@ RUN apt-get update && apt-get install -y \
${PYTHON}-pip \
git
-RUN ${PIP} --no-cache-dir install --upgrade \
+RUN ${PIP} install --no-cache-dir --upgrade \
pip \
setuptools
......@@ -59,7 +59,7 @@ RUN bazel version
# TODO(b/140751117) Unpin gast from 0.2.0.
# TODO(b/141279425): Remove pinned tf-estimator-nightly version.
# Install the TensorFlow Federated development environment dependencies
-RUN ${PIP} --no-cache-dir install \
+RUN ${PIP} install --no-cache-dir \
absl-py~=0.7 \
attrs~=18.2 \
cachetools~=3.1.1 \
......
......@@ -30,6 +30,8 @@ upper_tabs:
path: /federated/get_started
- title: Installation
path: /federated/install
- title: GCP Setup
path: /federated/gcp_setup
- title: Federated Learning
path: /federated/federated_learning
- title: Federated Core
......
# Set up simulations with TFF on GCP
This tutorial describes the steps required to set up high-performance
simulations with TFF on GCP.
1. [Install and initialize the Cloud SDK](https://cloud.google.com/sdk/docs/quickstarts).
1. Run a simulation on a single runtime container.
1. Start a single runtime container.
1. [Create a Compute Engine instance](https://cloud.google.com/endpoints/docs/grpc/get-started-compute-engine-docker#create_vm).
1. `ssh` into the instance.
```shell
$ gcloud compute ssh <instance>
```
1. Run the runtime container in the background.
```shell
$ docker run \
--detach \
--name=runtime \
--publish=8000:8000 \
gcr.io/tensorflow-federated/runtime
```
1. Exit the instance.
```shell
$ exit
```
1. Get the internal IP address of the instance.
This is used later as a parameter to our test script.
```shell
$ gcloud compute instances describe <instance> \
--format='get(networkInterfaces[0].networkIP)'
```
1. Start and run a simulation on a client container.
1. [Create a Compute Engine instance](https://cloud.google.com/endpoints/docs/grpc/get-started-compute-engine-docker#create_vm).
1. Copy your experiment to the Compute Engine instance.
```shell
$ gcloud compute scp \
"tensorflow_federated/tools/client/test.py" \
<instance>:~
```
1. `ssh` into the instance.
```shell
$ gcloud compute ssh <instance>
```
1. Run the client container interactively.
```shell
$ docker run \
--interactive \
--tty \
--name=client \
--volume ~:/simulation \
--workdir /simulation \
gcr.io/tensorflow-federated/client \
bash
```
1. Run the Python script.
Use the internal IP address of the instance running the runtime
container. The string "Hello World" should print to the terminal.
```shell
$ python3 test.py --host '<internal IP address>'
```
1. Exit the container.
```shell
$ exit
```
1. Exit the instance.
```shell
$ exit
```
1. Configure runtime and client images from source.
To run a simulation using TFF from source instead of a released version of
TFF, you need to build the runtime and client images from source, publish
those images to your own container registry, and then create the runtime and
client containers from those images instead of the released images that we
provide. See the
[Container Registry documentation](https://cloud.google.com/container-registry/docs/)
for more information.
1. Configure a runtime image.
```shell
$ bazel run //tensorflow_federated/tools/runtime/gcp:build_image
$ bazel run //tensorflow_federated/tools/runtime/gcp:publish_image -- \
<runtime registry>
```
1. Configure a client image.
```shell
$ bazel run //tensorflow_federated/tools/client:build_image
$ bazel run //tensorflow_federated/tools/client:publish_image -- \
<client registry>
```
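The from-source workflow above comes down to four `bazel run` invocations. As a minimal sketch, assuming you want to script them from Python, the hypothetical helper `from_source_commands` below only assembles the argument lists and does not run anything. The registry names in the example are placeholders, not real registries.

```python
from typing import List

# Bazel packages taken from the steps above.
RUNTIME_PKG = "//tensorflow_federated/tools/runtime/gcp"
CLIENT_PKG = "//tensorflow_federated/tools/client"


def from_source_commands(runtime_registry: str,
                         client_registry: str) -> List[List[str]]:
  """Returns the build and publish invocations, in the order listed above.

  This is a hypothetical helper for illustration; it is not part of TFF.
  """
  if not runtime_registry or not client_registry:
    raise ValueError("Both registries must be non-empty.")
  return [
      ["bazel", "run", RUNTIME_PKG + ":build_image"],
      ["bazel", "run", RUNTIME_PKG + ":publish_image", "--", runtime_registry],
      ["bazel", "run", CLIENT_PKG + ":build_image"],
      ["bazel", "run", CLIENT_PKG + ":publish_image", "--", client_registry],
  ]


if __name__ == "__main__":
  # Placeholder registry names, for illustration only.
  for command in from_source_commands("gcr.io/my-project/tff-runtime",
                                      "gcr.io/my-project/tff-client"):
    print(" ".join(command))
```

Each returned list could be passed to `subprocess.run(command, check=True)` from the repository root, so that a failed build stops the sequence before anything is published.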
......@@ -170,16 +170,18 @@ macOS.
<pre class="prettyprint lang-bsh">
<code class="devsite-terminal">docker build . \
---tag tensorflow_federated:latest</code>
+--tag tensorflow_federated</code>
</pre>
### 4. Start a Docker container.
<pre class="prettyprint lang-bsh">
-<code class="devsite-terminal">docker run -it \
---workdir /federated \
+<code class="devsite-terminal">docker run \
+--interactive \
+--tty \
--volume $(pwd):/federated \
-tensorflow_federated:latest \
+--workdir /federated \
+tensorflow_federated \
bash</code>
</pre>
......
......@@ -30,3 +30,11 @@ py_proto_library(
":computation_py_pb2",
],
)
filegroup(
name = "proto_files",
srcs = [
"computation.proto",
"executor.proto",
],
)
......@@ -48,7 +48,8 @@ def run_server(executor, num_threads, port, credentials=None, options=None):
   py_typecheck.check_type(executor, framework.Executor)
   py_typecheck.check_type(num_threads, int)
   py_typecheck.check_type(port, int)
-  py_typecheck.check_type(credentials, grpc.ServerCredentials)
+  if credentials is not None:
+    py_typecheck.check_type(credentials, grpc.ServerCredentials)
   if num_threads < 1:
     raise ValueError('The number of threads must be a positive integer.')
   if port < 1:
......
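The hunk above makes `credentials` optional: the type check is skipped when the argument is `None`, so an insecure server can be started without raising. A minimal, self-contained sketch of that validation pattern follows. `FakeCredentials` and `validate_server_args` are hypothetical stand-ins used only for illustration; they are not the TFF or gRPC API.

```python
from typing import Optional


class FakeCredentials:
  """Hypothetical stand-in for grpc.ServerCredentials."""


def validate_server_args(num_threads: int,
                         port: int,
                         credentials: Optional[FakeCredentials] = None) -> None:
  """Mirrors the argument checks in the hunk above (illustrative only)."""
  if not isinstance(num_threads, int):
    raise TypeError("num_threads must be an int.")
  if not isinstance(port, int):
    raise TypeError("port must be an int.")
  # Only type-check the credentials when the caller supplied them.
  if credentials is not None and not isinstance(credentials, FakeCredentials):
    raise TypeError("credentials must be a FakeCredentials instance.")
  if num_threads < 1:
    raise ValueError("The number of threads must be a positive integer.")
  if port < 1:
    raise ValueError("The port must be a positive integer.")


if __name__ == "__main__":
  validate_server_args(10, 8000)                     # no credentials: accepted
  validate_server_args(10, 8000, FakeCredentials())  # credentials: accepted
```

Before this change, the equivalent of the first call would have failed the type check, because `None` is not an instance of the credentials type.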
......@@ -28,12 +28,17 @@ flags.DEFINE_integer('port', '8000', 'port to listen on')
flags.DEFINE_integer('threads', '10', 'number of worker threads in thread pool')
flags.DEFINE_string('private_key', '', 'the private key for SSL/TLS setup')
flags.DEFINE_string('certificate_chain', '', 'the cert for SSL/TLS setup')
+flags.DEFINE_integer('clients', '1', 'number of clients to host on this worker')
+flags.DEFINE_integer('fanout', '100',
+                     'max fanout in the hierarchy of local executors')
 def main(argv):
   del argv
   tf.compat.v1.enable_v2_behavior()
-  executor = framework.create_local_executor()(None)
+  executor_factory = framework.create_local_executor(
+      num_clients=FLAGS.clients, max_fanout=FLAGS.fanout)
+  executor = executor_factory(None)
   if FLAGS.private_key:
     if FLAGS.certificate_chain:
       with open(FLAGS.private_key, 'rb') as f:
......
package(default_visibility = ["//visibility:private"])
licenses(["notice"]) # Apache 2.0 License
sh_binary(
name = "build_image",
srcs = ["build_image.sh"],
data = [
":dockerfile_file",
"//tensorflow_federated/tools/development:build_pip_package",
],
)
filegroup(
name = "dockerfile_file",
srcs = ["Dockerfile"],
)
sh_binary(
name = "publish_image",
srcs = ["publish_image.sh"],
)
filegroup(
name = "test",
srcs = ["test.py"],
tags = ["ignore_srcs"],
)
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM gcr.io/google_appengine/python
RUN python3 --version
COPY "tensorflow_federated-"*".whl" /
RUN pip3 install --no-cache-dir --upgrade \
pip \
setuptools
RUN pip3 install --no-cache-dir --upgrade \
"/tensorflow_federated-"*".whl"
RUN pip3 freeze
#!/usr/bin/env bash
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Tool to build the TensorFlow Federated client image.
#
# Usage:
# bazel run //tensorflow_federated/tools/client:build_image
#
# Arguments:
# artifacts_dir: A directory to use when generating intermediate artifacts,
# which can be useful during debugging. If no directory is specified, a
#     temporary directory will be used and cleaned up when this command exits.
set -e

main() {
  local artifacts_dir="$1"
  if [[ -z "${artifacts_dir}" ]]; then
    artifacts_dir="$(mktemp -d)"
    trap "rm -rf ${artifacts_dir}" EXIT
  fi

  cp -LR "tensorflow_federated" "${artifacts_dir}"
  pushd "${artifacts_dir}"

  # Build the TensorFlow Federated package
  tensorflow_federated/tools/development/build_pip_package \
      "${artifacts_dir}"

  # Build the TensorFlow Federated client image
  docker build \
      --file "tensorflow_federated/tools/client/Dockerfile" \
      --tag tff-client \
      .
}

main "$@"
#!/usr/bin/env bash
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Tool to publish the TensorFlow Federated client image to GCP.
#
# Usage:
# bazel run //tensorflow_federated/tools/client:publish_image -- <registry>
#
# Arguments:
# registry: A name of a container registry.
set -e

die() {
  echo >&2 "$@"
  exit 1
}

main() {
  local registry="$1"
  if [[ -z "${registry}" ]]; then
    die "A registry was not specified."
  fi

  docker tag tff-client "${registry}"
  docker push "${registry}"
}

main "$@"
# Lint as: python3
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Script for testing a remote executor on GCP."""

from absl import app
from absl import flags
import grpc
import tensorflow as tf
import tensorflow_federated as tff

tf.compat.v1.enable_v2_behavior()

FLAGS = flags.FLAGS

flags.DEFINE_string('host', None, 'The host to connect to.')
flags.mark_flag_as_required('host')
flags.DEFINE_string('port', '8000', 'The port to connect to.')


def main(argv):
  if len(argv) > 1:
    raise app.UsageError('Too many command-line arguments.')

  channel = grpc.insecure_channel('{}:{}'.format(FLAGS.host, FLAGS.port))
  remote_executor = tff.framework.RemoteExecutor(channel)
  caching_executor = tff.framework.CachingExecutor(remote_executor)
  lambda_executor = tff.framework.LambdaExecutor(caching_executor)
  tff.framework.set_default_executor(lambda_executor)
  print(tff.federated_computation(lambda: 'Hello World')())


if __name__ == '__main__':
  app.run(main)
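The test script joins `--host` and `--port` into a single `host:port` target string before opening the gRPC channel, and both flags are plain strings. A small sketch of that step with some input validation the script itself does not perform is shown below. `build_target` is a hypothetical helper introduced here for illustration, and the IP address is a placeholder.

```python
def build_target(host: str, port: str = "8000") -> str:
  """Builds a 'host:port' target string (hypothetical helper, not part of TFF)."""
  if not host:
    raise ValueError("A host must be specified.")
  port_number = int(port)  # Raises ValueError for non-numeric input.
  if not 1 <= port_number <= 65535:
    raise ValueError("The port must be between 1 and 65535.")
  return "{}:{}".format(host, port_number)


if __name__ == "__main__":
  # Placeholder internal IP address, for illustration only.
  print(build_target("10.128.0.2"))  # prints 10.128.0.2:8000
```

Validating up front gives a clear error for a mistyped port, rather than a connection failure surfacing later when the first computation is sent to the remote executor.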
-package(default_visibility = ["//visibility:private"])
+package(default_visibility = ["//tensorflow_federated/tools:__subpackages__"])
licenses(["notice"])
......@@ -10,7 +10,7 @@ sh_binary(
name = "build_pip_package",
srcs = ["build_pip_package.sh"],
data = [
-":pip_package_files",
+":setup",
"//tensorflow_federated",
"//tensorflow_federated/proto",
"//tensorflow_federated/proto/v0",
......@@ -24,7 +24,7 @@ sh_binary(
)
filegroup(
-name = "pip_package_files",
+name = "setup",
srcs = ["setup.py"],
tags = ["ignore_srcs"],
)
......
package(default_visibility = ["//visibility:private"])
licenses(["notice"]) # Apache 2.0 License
sh_binary(
name = "build_image",
srcs = ["build_image.sh"],
data = [
":dockerfile_file",
"//tensorflow_federated/python/simulation:worker",
"//tensorflow_federated/tools/development:build_pip_package",
],
)
filegroup(
name = "dockerfile_file",
srcs = ["Dockerfile"],
)
sh_binary(
name = "deploy_endpoint",
srcs = ["deploy_endpoint.sh"],
data = [
":worker_configuration_file",
"//tensorflow_federated/proto/v0:proto_files",
],
)
sh_binary(
name = "publish_image",
srcs = ["publish_image.sh"],
)
filegroup(
name = "worker_configuration_file",
srcs = ["worker.yaml"],
)
......@@ -13,18 +13,19 @@
# limitations under the License.
FROM gcr.io/google_appengine/python
-RUN virtualenv -p python3.6 /env
+RUN python3 --version
-ENV VIRTUAL_ENV /env
-ENV PATH /env/bin:$PATH
+COPY "tensorflow_federated-"*".whl" /
+COPY "tensorflow_federated/python/simulation/worker.py" /
-ADD requirements.txt /
-ADD tensorflow_federated/python/simulation/worker.py /
+RUN pip3 install --no-cache-dir --upgrade \
+pip \
+setuptools
-RUN python3 -m pip install -r /requirements.txt
-RUN python3 -m pip --no-cache-dir install tensorflow_federated
-RUN pip freeze
+RUN pip3 install --no-cache-dir --upgrade \
+"/tensorflow_federated-"*".whl"
+RUN pip3 freeze
EXPOSE 8000
-ENTRYPOINT ["python", "/worker.py"]
+ENTRYPOINT ["python3", "/worker.py"]
#!/usr/bin/env bash
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Tool to build the TensorFlow Federated runtime image.
#
# Usage:
# bazel run //tensorflow_federated/tools/runtime/gcp:build_image
#
# Arguments:
# artifacts_dir: A directory to use when generating intermediate artifacts,
# which can be useful during debugging. If no directory is specified, a
#     temporary directory will be used and cleaned up when this command exits.
set -e

main() {
  local artifacts_dir="$1"
  if [[ -z "${artifacts_dir}" ]]; then
    artifacts_dir="$(mktemp -d)"
    trap "rm -rf ${artifacts_dir}" EXIT
  fi

  cp -LR "tensorflow_federated" "${artifacts_dir}"
  pushd "${artifacts_dir}"

  # Build the TensorFlow Federated package
  tensorflow_federated/tools/development/build_pip_package \
      "${artifacts_dir}"

  # Build the TensorFlow Federated runtime image
  docker build \
      --file "tensorflow_federated/tools/runtime/gcp/Dockerfile" \
      --tag tff-runtime \
      .
}

main "$@"
#!/usr/bin/env bash
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Tool to deploy the TensorFlow Federated endpoint on GCP.
#
# Usage:
# bazel run //tensorflow_federated/tools/runtime/gcp:deploy_endpoint
set -e

main() {
  local artifacts_dir="$1"
  if [[ -z "${artifacts_dir}" ]]; then
    artifacts_dir="$(mktemp -d)"
    trap "rm -rf ${artifacts_dir}" EXIT
  fi

  cp -LR "tensorflow_federated" "${artifacts_dir}"
  pushd "${artifacts_dir}"

  # Create a virtual environment
  virtualenv --python=python3 "venv"
  source "venv/bin/activate"
  pip install --upgrade pip

  # Install gRPC
  pip install --upgrade grpcio grpcio-tools

  # Create the descriptor file
  mkdir "generated_pb2"
  python -m grpc_tools.protoc \
      --include_imports \
      --include_source_info \
      --proto_path=. \
      --descriptor_set_out="api_descriptor.pb" \
      --python_out="generated_pb2" \
      --grpc_python_out="generated_pb2" \
      "tensorflow_federated/proto/v0/executor.proto"

  # Deploy the Endpoints configuration
  gcloud endpoints services deploy \
      "api_descriptor.pb" \
      "tensorflow_federated/tools/runtime/gcp/worker.yaml"
}

main "$@"
#!/usr/bin/env bash
# Copyright 2019, The TensorFlow Federated Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Tool to publish the TensorFlow Federated runtime image to GCP.
#
# Usage:
# bazel run //tensorflow_federated/tools/runtime/gcp:publish_image -- <registry>
#
# Arguments:
# registry: A name of a container registry.
set -e

die() {
  echo >&2 "$@"
  exit 1
}

main() {
  local registry="$1"
  if [[ -z "${registry}" ]]; then
    die "A registry was not specified."
  fi

  docker tag tff-runtime "${registry}"
  docker push "${registry}"
}

main "$@"