We introduce here an object which is crucial to scaling to industrial Federated Learning: the Plan. It dramatically reduces bandwidth usage, allows asynchronous schemes and gives more autonomy to remote devices. The original concept of a plan can be found in the paper Towards Federated Learning at Scale: System Design, but it has been adapted to our needs in the PySyft library.
A Plan is intended to store a sequence of torch operations, just like a function, but it allows you to send this sequence of operations to remote workers and to keep a reference to it. This way, to remotely compute this sequence of $n$ operations on some remote input referenced through pointers, instead of sending $n$ messages you now only need to send a single message with the references of the plan and the pointers. You can also provide tensors with your function (which we call state tensors) for extended functionality. Plans can be seen either as a function that you can send, or as a class which can also be sent and executed remotely. Hence, for high-level users, the notion of a plan disappears and is replaced by a magic feature which allows arbitrary functions containing sequential torch operations to be sent to remote workers.
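To make the bandwidth claim concrete, here is a toy message counter in plain Python (an illustration of the idea, not PySyft code): executing the sequence operation by operation costs one network message per operation, while a built plan travels in a single message regardless of its length.

```python
# Toy illustration (not PySyft code): count the messages sent to a
# remote worker for op-by-op execution vs. a single plan message.
messages_sent = 0

def send_command(worker, command):
    """Deliver one command to the worker; each call is one network message."""
    global messages_sent
    messages_sent += 1
    worker.append(command)

remote_worker = []

# Op-by-op execution: one message per operation in the sequence.
ops = ["add", "abs", "mul", "relu"]
for op in ops:
    send_command(remote_worker, op)
print(messages_sent)   # 4 messages for 4 operations

# With a plan: the whole sequence travels in a single message.
messages_sent = 0
remote_worker.clear()
send_command(remote_worker, ("plan", ops))
print(messages_sent)   # 1 message, however long the sequence is
```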
One thing to notice is that the class of functions you can transform into plans is currently limited to sequences of hooked torch operations exclusively. This excludes in particular logical structures such as if, for and while statements, even if we are working on workarounds. To be completely precise, you can use these, but the logical path you take on the first computation of your plan (first if evaluating to False and 5 loops in the for, for example) will be the one kept for all subsequent computations, which we want to avoid in the majority of cases.
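The pitfall can be illustrated with a toy tracer in plain Python (a simplified model, not PySyft's internals): operations are recorded during the first call, so a branch decided at build time is replayed verbatim on every later input.

```python
# Toy illustration (not PySyft internals): a "plan" records the ops
# actually executed during the first (build) call, then replays them.
recorded_ops = []

def build(fn, sample_input):
    """Trace fn once: whatever branch runs is what gets recorded."""
    recorded_ops.clear()
    fn(sample_input, recorded_ops)
    return recorded_ops

def my_func(x, tape):
    if x > 0:                     # branch evaluated only at build time
        tape.append(("add", 1))
    else:
        tape.append(("sub", 1))
    return x

def run_plan(tape, x):
    """Replay the recorded ops on a new input: no branching happens."""
    for op, arg in tape:
        x = x + arg if op == "add" else x - arg
    return x

tape = build(my_func, 5)     # traced with x > 0, so "add" is recorded
print(run_plan(tape, 5))     # 6, as expected
print(run_plan(tape, -5))    # -4: still adds, even though x < 0!
```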
First let's make the official imports.
import torch
import torch.nn as nn
import torch.nn.functional as F
And then those specific to PySyft, with one important note: the local worker should not be a client worker. Non-client workers can store objects, and we need this ability to run a plan.
import syft as sy # import the Pysyft library
hook = sy.TorchHook(torch) # hook PyTorch ie add extra functionalities
# IMPORTANT: Local worker should not be a client worker
hook.local_worker.is_client_worker = False
server = hook.local_worker
We define remote workers or devices, to be consistent with the notions provided in the reference article. We provide them with some data.
x11 = torch.tensor([-1, 2.]).tag('input_data')
x12 = torch.tensor([1, -2.]).tag('input_data2')
x21 = torch.tensor([-1, 2.]).tag('input_data')
x22 = torch.tensor([1, -2.]).tag('input_data2')
device_1 = sy.VirtualWorker(hook, id="device_1", data=(x11, x12))
device_2 = sy.VirtualWorker(hook, id="device_2", data=(x21, x22))
devices = device_1, device_2
Let's define a function that we want to transform into a plan. To do so, it's as simple as adding a decorator above the function definition!
@sy.func2plan()
def plan_double_abs(x):
    x = x + x
    x = torch.abs(x)
    return x
Let's check: yes, we now have a plan!
plan_double_abs
To use a plan, you need two things: to build the plan (i.e. register the sequence of operations present in the function) and to send it to a worker / device. Fortunately you can do this very easily!
To build a plan you just need to call it on some data.
Let's first get a reference to some remote data: a request is sent over the network and a reference pointer is returned.
pointer_to_data = device_1.search('input_data')[0]
pointer_to_data
If we tell the plan it must be executed remotely on device_1, we'll get an error, because the plan has not been built yet.
plan_double_abs.is_built
# Sending non-built Plan will fail
try:
    plan_double_abs.send(device_1)
except RuntimeError as error:
    print(error)
To build a plan, you just need to call build on the plan and pass the arguments needed to execute it (i.e. some data). When a plan is built, all the commands are executed sequentially by the local worker, caught by the plan and stored in its actions attribute!
plan_double_abs.build(torch.tensor([1., -2.]))
plan_double_abs.is_built
If we try to send the plan now it works!
# This cell is executed successfully
pointer_plan = plan_double_abs.send(device_1)
pointer_plan
As with tensors, we get a pointer to the object sent. Here it is simply called a PointerPlan.
One important thing to remember is that when a plan is built, we pre-set ahead of computation the id(s) where the result(s) should be stored. This allows us to send commands asynchronously, to already hold a reference to a virtual result, and to continue local computations without waiting for the remote result to be computed. One major application is when you require computation of a batch on device_1 and don't want to wait for this computation to end before launching another batch computation on device_2.
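This pre-setting of result ids can be sketched as follows (a simplified model, not PySyft's actual implementation): the caller mints the result id before the computation runs, so it immediately holds a reference it can use or pass around.

```python
# Simplified model (not PySyft's actual code): the result id is chosen
# *before* the remote computation runs, so the caller holds a reference
# right away and can keep working.
class ToyWorker:
    def __init__(self):
        self.store = {}

    def execute_plan(self, plan, arg, result_id):
        # In a real system this would be dispatched asynchronously to
        # the device; here it runs inline for simplicity.
        self.store[result_id] = plan(arg)

next_id = iter(range(1000))

def call_plan(worker, plan, arg):
    result_id = next(next_id)                 # id fixed ahead of computation
    worker.execute_plan(plan, arg, result_id)
    return result_id                          # a "pointer" to the result

w = ToyWorker()
rid = call_plan(w, lambda x: abs(x + x), -3)
print(w.store[rid])   # 6
```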
We can now remotely run the plan by calling the pointer to the plan with a pointer to some data. This issues a command to run this plan remotely, so that the predefined location of the output of the plan now contains the result (remember we pre-set location of result ahead of computation). This also requires a single communication round.
The result is simply a pointer, just like when you call a usual hooked torch function!
pointer_to_result = pointer_plan(pointer_to_data)
print(pointer_to_result)
And you can simply ask the value back.
pointer_to_result.get()
But what we want to do is apply Plans to deep and federated learning, right? So let's look at a slightly more complicated example, using neural networks as you might want to use them. Note that we are now transforming a class into a plan. To do so, we make our class inherit from sy.Plan (instead of inheriting from nn.Module).
class Net(sy.Plan):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(2, 3)
        self.fc2 = nn.Linear(3, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=0)
net = Net()
net
Let's build the plan using some mock data.
net.build(torch.tensor([1., 2.]))
We now send the plan to a remote worker
pointer_to_net = net.send(device_1)
pointer_to_net
Let's retrieve some remote data
pointer_to_data = device_1.search('input_data')[0]
Then, the syntax is just like normal remote sequential execution, that is, just like local execution. But compared to classic remote execution, there is only a single communication round per execution.
pointer_to_result = pointer_to_net(pointer_to_data)
pointer_to_result
And we get the result as usual!
pointer_to_result.get()
Et voilà! We have seen how to dramatically reduce the communication between the local worker (or server) and the remote devices!
One major feature we want is to use the same plan across several workers, switching between them depending on the remote batch of data we are considering. In particular, we don't want to rebuild the plan each time we change workers. Let's see how we do this, using the previous example with our small network.
class Net(sy.Plan):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(2, 3)
        self.fc2 = nn.Linear(3, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=0)
net = Net()
# Build plan
net.build(torch.tensor([1., 2.]))
Here are the main steps we just executed:
pointer_to_net_1 = net.send(device_1)
pointer_to_data = device_1.search('input_data')[0]
pointer_to_result = pointer_to_net_1(pointer_to_data)
pointer_to_result.get()
And you can actually build other PointerPlans from the same plan, so the syntax for running a plan remotely on another device is the same.
pointer_to_net_2 = net.send(device_2)
pointer_to_data = device_2.search('input_data')[0]
pointer_to_result = pointer_to_net_2(pointer_to_data)
pointer_to_result.get()
Note: Currently, with Plan classes, you can only use a single method and you have to name it "forward".
For functions (@sy.func2plan) we can build the plan automatically, with no need to call build explicitly: the plan is already built at the moment of creation. To get this functionality, the only thing you need to change when creating a plan is to set an argument of the decorator called args_shape, which should be a list containing the shapes of each argument.
@sy.func2plan(args_shape=[(-1, 1)])
def plan_double_abs(x):
    x = x + x
    x = torch.abs(x)
    return x
plan_double_abs.is_built
The args_shape parameter is used internally to create mock tensors with the given shape, which are used to build the plan.
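The general mechanism can be sketched in plain Python (an assumption about the idea, not PySyft's exact code): a zero-filled mock of the right shape is enough to trace the function once, because only the operations are recorded, not the data.

```python
# Plain-Python sketch (not PySyft's actual code): building from
# args_shape amounts to creating zero-filled mock inputs with the given
# shapes and tracing the function on them once.
def make_mock(shape):
    """Create a nested list of zeros with the given 2D shape.
    A -1 dimension is treated as size 1, since any concrete size
    works for tracing."""
    rows = 1 if shape[0] == -1 else shape[0]
    cols = shape[1]
    return [[0.0] * cols for _ in range(rows)]

def build_from_shapes(fn, args_shape):
    """Trace fn once on mock data; afterwards the 'plan' is built."""
    mocks = [make_mock(s) for s in args_shape]
    fn(*mocks)
    return True

is_built = build_from_shapes(lambda x, y: None, [(1, 2), (-1, 2)])
print(is_built)   # True
```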
@sy.func2plan(args_shape=[(1, 2), (-1, 2)])
def plan_sum_abs(x, y):
    s = x + y
    return torch.abs(s)
plan_sum_abs.is_built
You can also provide state elements to functions!
@sy.func2plan(args_shape=[(1,)], state=(torch.tensor([1]), ))
def plan_abs(x, state):
    bias, = state.read()
    x = x.abs()
    return x + bias
pointer_plan = plan_abs.send(device_1)
x_ptr = torch.tensor([-1, 0]).send(device_1)
p = pointer_plan(x_ptr)
p.get()
To learn more about this, you can discover how we use Plans with Protocols in Tutorial Part 8 bis!
The easiest way to help our community is just by starring the repositories! This helps raise awareness of the cool tools we're building.
We made really nice tutorials to get a better understanding of what Federated and Privacy-Preserving Learning should look like and how we are building the bricks for this to happen.
The best way to keep up to date on the latest advancements is to join our community!
The best way to contribute to our community is to become a code contributor! If you want to start "one off" mini-projects, you can go to PySyft GitHub Issues page and search for issues marked Good First Issue
.
If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!