From 0462f9c1fd9bfc9b4d25f2161dafa8bc5d7a317e Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Wed, 31 Jul 2019 11:15:26 +0200 Subject: [PATCH 01/29] docs(README): update README --- README.md | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 70 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 34bf7587..ea766dca 100644 --- a/README.md +++ b/README.md @@ -22,6 +22,11 @@ At startup, the agent will communicate with the Docker node it is deployed on vi This implementation is using *serf* to form a cluster over a network, each agent requires an address where it will advertise its ability to be part of a cluster and a join address where it will be able to reach other agents. +The agent retrieves the IP address it can use to create +a cluster by inspecting the Docker networks associated to the agent container. If multiple networks are available, it will pickup the first network available and retrieve the IP address inside this network. + +Note: Be careful when deploying the agent to not deploy it inside the Swarm ingress network (by not using `mode=host` when exposing ports). This could lead the agent to not be able to create a cluster correctly if picking the IP address inside the ingress network. + ### Proxy The agent works as a proxy to the Docker API on which it is deployed as well as a proxy to the other agents inside the cluster. @@ -72,6 +77,10 @@ The agent also exposes the following endpoints: * `/browse/put` (*POST*): Upload a file under a specific path on the filesytem * `/host/info` (*GET*): Get information about the underlying host system * `/ping` (*GET*): Returns a 204. 
Public endpoint that do not require any form of authentication +* `/key` (*GET*): Returns the Edge key associated to the agent **only available when agent is started in Edge mode** +* `/key` (*POST*): Set the Edge key on this agent **only available when agent is started in Edge mode** +* `/websocket/attach` (*GET*): Websocket attach endpoint (for container console usage) +* `/websocket/exec` (*GET*): Websocket exec endpoint (for container console usage) Note: The `/browse/*` endpoints can be used to manage a filesystem. By default, it allows manipulation of files in Docker volumes (available under `/var/run/docker/volumes` when bind-mounted in the agent container) but can also manipulate files anywhere on the filesystem. To enable global filesystem manipulation support for these endpoints, the `CAP_HOST_MANAGEMENT` environment variable must be set to `1`. @@ -80,7 +89,55 @@ filesystem manipulation support for these endpoints, the `CAP_HOST_MANAGEMENT` e The agent API version is exposed via the `Portainer-Agent-API-Version` in each response of the agent. -## Security +## Using the agent in Edge mode + +The following information is only relevant for an Agent that was started in Edge mode. + +### Purpose + +The Edge mode is mainly used in the case of your remote environment being not in the same network as your Portainer instance. When started in Edge mode, the agent will reach out to the Portainer instance +and will take care of creating a reverse tunnel allowing the Portainer instance to query it. It uses a token (Edge key) that contains the required information to connect to a specific Portainer instance. + +### Startup + +To start an agent in Edge mode, the `EDGE=1` environment variable must be set. 
+ +Upon startup, the agent will try to retrieve an existing Edge key in the following order: + +* from the environment variables via the `EDGE_KEY` environment variable +* from the filesystem +* from the cluster (if joining an existing Edge agent cluster) + +If no Edge key was retrieved, the agent will start a HTTP server where it will expose a UI to associate an Edge key. + +For security reasons, the Edge server UI will shutdown after 15 minutes if no key has been specified. The agent will require a restart in order +to access the Edge UI again. + + +### Edge key + +The Edge key is used by the agent to connect to a specific Portainer instance. It is encoded using base64 and contains the following information: + +* Portainer instance API URL +* Portainer instance tunnel server address +* Portainer instance tunnel server fingerprint +* Endpoint identifier + +This information is represented in the following format before encoding (single string using the `|` character as a separator): + +``` +portainer_instance_url|tunnel_server_addr|tunnel_server_fingerprint|endpoint_ID +``` + +The Edge key associated to an agent will be persisted on disk after association under `/data/agent_edge_key`. + +### Polling + +### Security + +## Using the agent (non Edge) + +The following information is only relevant for an Agent that was not started in Edge mode. ### Encryption @@ -132,17 +189,26 @@ This mode will allow multiple instances of Portainer to connect to a single agen Note: Due to the fact that the agent will now decode and parse the public key associated to each request, this mode might be less performant than the default mode. - ## Deployment options The behavior of the agent can be tuned via a set of mandatory and optional options available as environment variables: * AGENT_CLUSTER_ADDR (*mandatory*): address (in the IP:PORT format) of an existing agent to join the agent cluster. 
When deploying the agent as a Docker Swarm service, we can leverage the internal Docker DNS to automatically join existing agents or form a cluster by using `tasks.:` as the address. -* AGENT_PORT (*optional*): port on which the agent web server will listen (default to `9001`). -* CAP_HOST_MANAGEMENT (*optional*): enable advanced filesystem management features. Disabled by default, set to `1` to enable it. +* AGENT_HOST (*optional*): address on which the agent API will be exposed (default to `0.0.0.0`) +* AGENT_PORT (*optional*): port on which the agent API will be exposed (default to `9001`) +* CAP_HOST_MANAGEMENT (*optional*): enable advanced filesystem management features. Disabled by default, set to `1` to enable it * AGENT_SECRET (*optional*): shared secret used in the signature verification process * LOG_LEVEL (*optional*): defines the log output verbosity (default to `INFO`) +* EDGE (*optional*): enable Edge mode. Disabled by default, set to `1` to enable it +* EDGE_KEY (*optional*): specify an Edge key to use at startup +* EDGE_ID (*mandatory when EDGE=1*): a unique identifier associated to this agent cluster +* EDGE_SERVER_HOST (*optional*): address on which the Edge UI will be exposed (default to `0.0.0.0`) +* EDGE_SERVER_PORT (*optional*): port on which the Edge UI will be exposed (default to `80`). +* EDGE_POLL_FREQUENCY (*optional*): frequency that will be used by the agent to poll the Portainer instance (default to `5s`) +* EDGE_INACTIVITY_TIMEOUT (*optional*): timeout used by the agent to close the reverse tunnel after inactivity (default to `5m`) +* EDGE_INSECURE_POLL (*optional*): enable this option if you need the agent to poll a HTTPS Portainer instance with self-signed certificates. 
Disabled by default, set to `1` to enable it + For more information about deployment scenarios, see: https://portainer.readthedocs.io/en/stable/agent.html From 0985d2083ae8ca2f6d50e5dea6bc6ac2b0062404 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Thu, 1 Aug 2019 17:38:52 +0200 Subject: [PATCH 02/29] docs(README): update README --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index ea766dca..25782fec 100644 --- a/README.md +++ b/README.md @@ -108,7 +108,7 @@ Upon startup, the agent will try to retrieve an existing Edge key in the followi * from the filesystem * from the cluster (if joining an existing Edge agent cluster) -If no Edge key was retrieved, the agent will start a HTTP server where it will expose a UI to associate an Edge key. +If no Edge key was retrieved, the agent will start a HTTP server where it will expose a UI to associate an Edge key. After associating a key via the UI, the UI server will shutdown. For security reasons, the Edge server UI will shutdown after 15 minutes if no key has been specified. The agent will require a restart in order to access the Edge UI again. @@ -133,6 +133,10 @@ The Edge key associated to an agent will be persisted on disk after association ### Polling +After associating an Edge key to an agent, this one will start polling the associated Portainer instance. + +### Reverse tunnel + ### Security ## Using the agent (non Edge) From ec0746a06bacdfbc7190e942faa7531f68042def Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Thu, 1 Aug 2019 18:10:11 +0200 Subject: [PATCH 03/29] docs(README): update README --- README.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 25782fec..1f333e39 100644 --- a/README.md +++ b/README.md @@ -105,7 +105,7 @@ To start an agent in Edge mode, the `EDGE=1` environment variable must be set. 
Upon startup, the agent will try to retrieve an existing Edge key in the following order: * from the environment variables via the `EDGE_KEY` environment variable -* from the filesystem +* from the filesystem (see the Edge key section below for more information about key persistence on disk) * from the cluster (if joining an existing Edge agent cluster) If no Edge key was retrieved, the agent will start a HTTP server where it will expose a UI to associate an Edge key. After associating a key via the UI, the UI server will shutdown. @@ -113,7 +113,6 @@ If no Edge key was retrieved, the agent will start a HTTP server where it will e For security reasons, the Edge server UI will shutdown after 15 minutes if no key has been specified. The agent will require a restart in order to access the Edge UI again. - ### Edge key The Edge key is used by the agent to connect to a specific Portainer instance. It is encoded using base64 and contains the following information: @@ -135,6 +134,19 @@ The Edge key associated to an agent will be persisted on disk after association After associating an Edge key to an agent, this one will start polling the associated Portainer instance. +It will use the Portainer instance API URL and the endpoint identifier available in the Edge key to build the poll request URL: `http(s)://API_URL/api/endpoints/ENDPOINT_ID/status` + +The response of the poll request contains the following information: + +* Tunnel status +* Poll frequency +* Tunnel port +* Encrypted credentials +* Schedules + +The tunnel status property can take one of the following values: `IDLE`, `REQUIRED`, `ACTIVE`. When this property is set to `REQUIRED`, the agent will +create a reverse tunnel to the Portainer instance using the port specified in the response as well as the credentials. 
+ ### Reverse tunnel ### Security From e2b870620b13f06f8517163116063a88352e65c1 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Fri, 2 Aug 2019 09:40:04 +0200 Subject: [PATCH 04/29] docs(README): update README --- README.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 1f333e39..9fcf5f97 100644 --- a/README.md +++ b/README.md @@ -147,9 +147,25 @@ The response of the poll request contains the following information: The tunnel status property can take one of the following values: `IDLE`, `REQUIRED`, `ACTIVE`. When this property is set to `REQUIRED`, the agent will create a reverse tunnel to the Portainer instance using the port specified in the response as well as the credentials. +Each poll request sent to the Portainer instance contains the `X-PortainerAgent-EdgeID` header (with the value set to the Edge ID associated to the agent). This is used by the Portainer instance to associate an Edge ID to an endpoint so that an agent won't be able to poll information and join an Edge cluster by re-using an existing key without knowing the Edge ID. + +To allow for pre-staged environments, this Edge ID is associated to an endpoint by Portainer after receiving the first poll request from an agent. + ### Reverse tunnel -### Security +The reverse tunnel is established by the agent. The permissions associated to the credentials are set on the Portainer instance, the credentials are valid for a management session and can only be used +to create a reverse tunnel on a specific port (the one that is specified in the poll response). + +The agent will monitor the usage of the tunnel. The tunnel will be closed in any of the following cases: + +1. The status of the tunnel specified in the poll response is equal to `IDLE` +2. 
If no activity has been registered on the tunnel (no requests executed against the agent API) after a specific amount of time (can be configured via `EDGE_INACTIVITY_TIMEOUT`, default to 5 minutes) + +### API server + +When deployed in Edge mode, the agent API is not exposed over HTTPS anymore (see Using the agent non Edge section below) because we're using SSH to setup an encrypted tunnel. In order to avoid potential security issues with agent deployment exposing the API port on their host, the agent won't expose the API server under 0.0.0.0. Instead, it will expose the API server on the same IP address that is used to advertise the cluster (usually, the container IP in the overlay network). + +This means that only a container deployed in the same overlay network as the agent will be able to query it. ## Using the agent (non Edge) From 571e336eb3ec1f857c15dc1f33c17a37a8af08f9 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 6 Aug 2019 18:21:42 +0200 Subject: [PATCH 05/29] chore(dep): update libhttp to version 1.1.0 --- Gopkg.lock | 6 +++--- Gopkg.toml | 2 +- dev.sh | 2 +- vendor/github.com/portainer/libhttp/error/error.go | 4 ++-- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Gopkg.lock b/Gopkg.lock index d3b27e50..0f7d1370 100644 --- a/Gopkg.lock +++ b/Gopkg.lock @@ -315,7 +315,7 @@ version = "1.0.0" [[projects]] - digest = "1:d8d777ffc4d14552fd69f7d01a7193834817a0b0e93081947e3c17e5a37b689a" + digest = "1:5e1c756f27553794c01dfd0a925a8eeb5b70da7798f2a4f90b2cf4a52241b1f2" name = "github.com/portainer/libhttp" packages = [ "error", @@ -323,8 +323,8 @@ "response", ] pruneopts = "UT" - revision = "b69b0bcbc8399f341f118e4668d36ca7f453dccf" - version = "1.0.1" + revision = "cde6e97fcd52778d2ec707c316212d6ec7689c19" + version = "1.1.0" [[projects]] branch = "master" diff --git a/Gopkg.toml b/Gopkg.toml index fd50e098..076386ac 100644 --- a/Gopkg.toml +++ b/Gopkg.toml @@ -76,7 +76,7 @@ [[constraint]] name = "github.com/portainer/libhttp" - version 
= "=1.0.1" + version = "=1.1.0" [[constraint]] name = "github.com/portainer/libcrypto" diff --git a/dev.sh b/dev.sh index c833c3ae..4311983e 100755 --- a/dev.sh +++ b/dev.sh @@ -2,7 +2,7 @@ LOG_LEVEL=DEBUG CAP_HOST_MANAGEMENT=1 #Enabled by default. Change this to anything else to disable this feature -EDGE=1 +EDGE=0 TMP="/tmp" GIT_COMMIT_HASH=`git rev-parse --short HEAD` GIT_BRANCH_NAME=`git rev-parse --abbrev-ref HEAD` diff --git a/vendor/github.com/portainer/libhttp/error/error.go b/vendor/github.com/portainer/libhttp/error/error.go index 6f5726ec..a1f07bd2 100644 --- a/vendor/github.com/portainer/libhttp/error/error.go +++ b/vendor/github.com/portainer/libhttp/error/error.go @@ -19,7 +19,7 @@ type ( } errorResponse struct { - Err string `json:"err,omitempty"` + Message string `json:"message,omitempty"` Details string `json:"details,omitempty"` } ) @@ -35,7 +35,7 @@ func writeErrorResponse(rw http.ResponseWriter, err *HandlerError) { log.Printf("http error: %s (err=%s) (code=%d)\n", err.Message, err.Err, err.StatusCode) rw.Header().Set("Content-Type", "application/json") rw.WriteHeader(err.StatusCode) - json.NewEncoder(rw).Encode(&errorResponse{Err: err.Message, Details: err.Err.Error()}) + json.NewEncoder(rw).Encode(&errorResponse{Message: err.Message, Details: err.Err.Error()}) } // WriteError is a convenience function that creates a new HandlerError before calling writeErrorResponse. From 595422d954dabbbcfa9999cc8028304211135c6f Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 6 Aug 2019 18:28:39 +0200 Subject: [PATCH 06/29] fix(build-system): revert dev.sh changes --- dev.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dev.sh b/dev.sh index 4311983e..c833c3ae 100755 --- a/dev.sh +++ b/dev.sh @@ -2,7 +2,7 @@ LOG_LEVEL=DEBUG CAP_HOST_MANAGEMENT=1 #Enabled by default. 
Change this to anything else to disable this feature -EDGE=0 +EDGE=1 TMP="/tmp" GIT_COMMIT_HASH=`git rev-parse --short HEAD` GIT_BRANCH_NAME=`git rev-parse --abbrev-ref HEAD` From 39f88d553aebf5c019746b3119cbd16dd03c7f1e Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 13 Aug 2019 11:15:12 +0200 Subject: [PATCH 07/29] feat(docker): skip ingress network if found during IP detection --- dev.sh | 6 +++--- docker/docker.go | 10 +++++++--- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/dev.sh b/dev.sh index c833c3ae..a949dd55 100755 --- a/dev.sh +++ b/dev.sh @@ -2,7 +2,7 @@ LOG_LEVEL=DEBUG CAP_HOST_MANAGEMENT=1 #Enabled by default. Change this to anything else to disable this feature -EDGE=1 +EDGE=0 TMP="/tmp" GIT_COMMIT_HASH=`git rev-parse --short HEAD` GIT_BRANCH_NAME=`git rev-parse --abbrev-ref HEAD` @@ -84,7 +84,7 @@ function deploy_swarm() { echo "Deployment..." - docker -H "${DOCKER_MANAGER}:2375" network create --driver overlay --attachable portainer-agent-dev-net + docker -H "${DOCKER_MANAGER}:2375" network create --driver overlay portainer-agent-dev-net docker -H "${DOCKER_MANAGER}:2375" service create --name portainer-agent-dev \ --network portainer-agent-dev-net \ -e LOG_LEVEL="${LOG_LEVEL}" \ @@ -96,7 +96,7 @@ function deploy_swarm() { --mount type=bind,src=//var/run/docker.sock,dst=/var/run/docker.sock \ --mount type=bind,src=//var/lib/docker/volumes,dst=/var/lib/docker/volumes \ --mount type=bind,src=//,dst=/host \ - --publish mode=host,target=9001,published=9001 \ + --publish target=9001,published=9001 \ --publish mode=host,published=80,target=80 \ --restart-condition none \ "${IMAGE_NAME}" diff --git a/docker/docker.go b/docker/docker.go index 5386311c..33fde0b4 100644 --- a/docker/docker.go +++ b/docker/docker.go @@ -58,9 +58,13 @@ func (service *InfoService) GetContainerIpFromDockerEngine(containerName string) return "", err } - for _, network := range containerInspect.NetworkSettings.Networks { - if network.IPAddress != "" { - 
log.Printf("[DEBUG] [docker] [network_count: %d] [ip_address: %s] [message: Retrieving IP address from container networks]", len(containerInspect.NetworkSettings.Networks), network.IPAddress) + if len(containerInspect.NetworkSettings.Networks) > 1 { + log.Printf("[WARN] [docker] [network_count: %d] [message: Agent container running in more than a single Docker network. This might cause communication issues.]", len(containerInspect.NetworkSettings.Networks)) + } + + for name, network := range containerInspect.NetworkSettings.Networks { + if name != "ingress" && network.IPAddress != "" { + log.Printf("[DEBUG] [docker] [ip_address: %s] [message: Retrieving IP address from container networks]", network.IPAddress) return network.IPAddress, nil } } From 95e21700d3c4f9969a51ff949bc52051a578cb2a Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 13 Aug 2019 11:17:50 +0200 Subject: [PATCH 08/29] refactor(docker): review WARN message --- docker/docker.go | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker/docker.go b/docker/docker.go index 33fde0b4..9ce350c5 100644 --- a/docker/docker.go +++ b/docker/docker.go @@ -59,7 +59,7 @@ func (service *InfoService) GetContainerIpFromDockerEngine(containerName string) } if len(containerInspect.NetworkSettings.Networks) > 1 { - log.Printf("[WARN] [docker] [network_count: %d] [message: Agent container running in more than a single Docker network. This might cause communication issues.]", len(containerInspect.NetworkSettings.Networks)) + log.Printf("[WARN] [docker] [network_count: %d] [message: Agent container running in more than a single Docker network. 
This might cause communication issues]", len(containerInspect.NetworkSettings.Networks)) } for name, network := range containerInspect.NetworkSettings.Networks { From f311d0107719da015bea12682a0cd340a0962d79 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 13 Aug 2019 11:41:28 +0200 Subject: [PATCH 09/29] feat(docker): enhance valid network detection when retrieving container IP --- docker/docker.go | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/docker/docker.go b/docker/docker.go index 9ce350c5..5d34034f 100644 --- a/docker/docker.go +++ b/docker/docker.go @@ -5,6 +5,7 @@ import ( "errors" "log" + "github.com/docker/docker/api/types" "github.com/docker/docker/client" "github.com/portainer/agent" ) @@ -62,9 +63,19 @@ func (service *InfoService) GetContainerIpFromDockerEngine(containerName string) log.Printf("[WARN] [docker] [network_count: %d] [message: Agent container running in more than a single Docker network. This might cause communication issues]", len(containerInspect.NetworkSettings.Networks)) } - for name, network := range containerInspect.NetworkSettings.Networks { - if name != "ingress" && network.IPAddress != "" { - log.Printf("[DEBUG] [docker] [ip_address: %s] [message: Retrieving IP address from container networks]", network.IPAddress) + for networkName, network := range containerInspect.NetworkSettings.Networks { + networkInspect, err := cli.NetworkInspect(context.Background(), network.NetworkID, types.NetworkInspectOptions{}) + if err != nil { + return "", err + } + + if networkInspect.Ingress || networkInspect.Scope != "swarm" { + log.Printf("[DEBUG] [docker] [network_name: %s] [scope: %s] [ingress: %t] [message: Skipping invalid container network]", networkInspect.Name, networkInspect.Scope, networkInspect.Ingress) + continue + } + + if network.IPAddress != "" { + log.Printf("[DEBUG] [docker] [ip_address: %s] [network_name: %s] [message: Retrieving IP address from container network]", 
network.IPAddress, networkName) return network.IPAddress, nil } } From f08f4eefdd5d1caef9d7190d07990daff70f4db2 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 13 Aug 2019 11:43:22 +0200 Subject: [PATCH 10/29] feat(serf): update agent nodeName to include information about the host node name --- serf/cluster.go | 2 ++ 1 file changed, 2 insertions(+) diff --git a/serf/cluster.go b/serf/cluster.go index 6ddc8ddb..4b8ddf38 100644 --- a/serf/cluster.go +++ b/serf/cluster.go @@ -1,6 +1,7 @@ package serf import ( + "fmt" "log" "os" @@ -40,6 +41,7 @@ func (service *ClusterService) Create(advertiseAddr string, joinAddr []string) e conf := serf.DefaultConfig() conf.Init() + conf.NodeName = fmt.Sprintf("%s-%s", service.tags[agent.MemberTagKeyNodeName], conf.NodeName) conf.Tags = service.tags conf.MemberlistConfig.LogOutput = filter conf.LogOutput = filter From 0f98a5eca09740ea1ffaac0dad9d3734778ff260 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Wed, 14 Aug 2019 11:55:28 +0200 Subject: [PATCH 11/29] Update README.md Co-Authored-By: William --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9fcf5f97..fbe60376 100644 --- a/README.md +++ b/README.md @@ -132,7 +132,7 @@ The Edge key associated to an agent will be persisted on disk after association ### Polling -After associating an Edge key to an agent, this one will start polling the associated Portainer instance. +After associating an Edge key to an agent, the agent will start polling the associated Portainer instance. 
It will use the Portainer instance API URL and the endpoint identifier available in the Edge key to build the poll request URL: `http(s)://API_URL/api/endpoints/ENDPOINT_ID/status` From 19678c6d6e5a1b25a1085458265e479dbbd2bd64 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Wed, 14 Aug 2019 11:55:46 +0200 Subject: [PATCH 12/29] Update README.md Co-Authored-By: William --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index fbe60376..e29a3d5c 100644 --- a/README.md +++ b/README.md @@ -25,7 +25,7 @@ ability to be part of a cluster and a join address where it will be able to reac The agent retrieves the IP address it can use to create a cluster by inspecting the Docker networks associated to the agent container. If multiple networks are available, it will pickup the first network available and retrieve the IP address inside this network. -Note: Be careful when deploying the agent to not deploy it inside the Swarm ingress network (by not using `mode=host` when exposing ports). This could lead the agent to not be able to create a cluster correctly if picking the IP address inside the ingress network. +Note: Be careful when deploying the agent to not deploy it inside the Swarm ingress network (by not using `mode=host` when exposing ports). This could lead to the agent being unable to create a cluster correctly, if picking the IP address inside the ingress network. 
### Proxy From 3a70e90765d0ebb8d79e6e9a9ef51379a34e1c05 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 17 Sep 2019 15:37:02 +1200 Subject: [PATCH 13/29] feat(dep): update Gopkg.lock --- Gopkg.lock | 1 + 1 file changed, 1 insertion(+) diff --git a/Gopkg.lock b/Gopkg.lock index 0f7d1370..ccd1a696 100644 --- a/Gopkg.lock +++ b/Gopkg.lock @@ -386,6 +386,7 @@ input-imports = [ "github.com/Microsoft/go-winio", "github.com/asaskevich/govalidator", + "github.com/docker/docker/api/types", "github.com/docker/docker/client", "github.com/gorilla/mux", "github.com/gorilla/websocket", From 363e5eec97ec30d08ff350a669c6cc184df4b057 Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 17 Sep 2019 15:48:49 +1200 Subject: [PATCH 14/29] feat(options): remove EDGE_POLL_FREQUENCY option --- README.md | 1 - agent.go | 1 - cmd/agent/main.go | 2 +- os/options.go | 11 ----------- 4 files changed, 1 insertion(+), 14 deletions(-) diff --git a/README.md b/README.md index e29a3d5c..658fc634 100644 --- a/README.md +++ b/README.md @@ -237,7 +237,6 @@ we can leverage the internal Docker DNS to automatically join existing agents or * EDGE_ID (*mandatory when EDGE=1*): a unique identifier associated to this agent cluster * EDGE_SERVER_HOST (*optional*): address on which the Edge UI will be exposed (default to `0.0.0.0`) * EDGE_SERVER_PORT (*optional*): port on which the Edge UI will be exposed (default to `80`). -* EDGE_POLL_FREQUENCY (*optional*): frequency that will be used by the agent to poll the Portainer instance (default to `5s`) * EDGE_INACTIVITY_TIMEOUT (*optional*): timeout used by the agent to close the reverse tunnel after inactivity (default to `5m`) * EDGE_INSECURE_POLL (*optional*): enable this option if you need the agent to poll a HTTPS Portainer instance with self-signed certificates. 
Disabled by default, set to `1` to enable it diff --git a/agent.go b/agent.go index dfd69198..27af5a37 100644 --- a/agent.go +++ b/agent.go @@ -14,7 +14,6 @@ type ( EdgeServerAddr string EdgeServerPort string EdgeInactivityTimeout string - EdgePollFrequency string EdgeInsecurePoll bool LogLevel string } diff --git a/cmd/agent/main.go b/cmd/agent/main.go index a92db152..4e8e3e96 100644 --- a/cmd/agent/main.go +++ b/cmd/agent/main.go @@ -80,7 +80,7 @@ func main() { operatorConfig := &tunnel.OperatorConfig{ APIServerAddr: apiServerAddr, EdgeID: options.EdgeID, - PollFrequency: options.EdgePollFrequency, + PollFrequency: agent.DefaultEdgePollInterval, InactivityTimeout: options.EdgeInactivityTimeout, InsecurePoll: options.EdgeInsecurePoll, } diff --git a/os/options.go b/os/options.go index afad73de..e362e91f 100644 --- a/os/options.go +++ b/os/options.go @@ -20,7 +20,6 @@ const ( EnvKeyEdgeID = "EDGE_ID" EnvKeyEdgeServerHost = "EDGE_SERVER_HOST" EnvKeyEdgeServerPort = "EDGE_SERVER_PORT" - EnvKeyEdgePollFrequency = "EDGE_POLL_FREQUENCY" EnvKeyEdgeInactivityTimeout = "EDGE_INACTIVITY_TIMEOUT" EnvKeyEdgeInsecurePoll = "EDGE_INSECURE_POLL" EnvKeyLogLevel = "LOG_LEVEL" @@ -42,7 +41,6 @@ func (parser *EnvOptionParser) Options() (*agent.Options, error) { EdgeID: os.Getenv(EnvKeyEdgeID), EdgeServerAddr: agent.DefaultEdgeServerAddr, EdgeServerPort: agent.DefaultEdgeServerPort, - EdgePollFrequency: agent.DefaultEdgePollInterval, EdgeInactivityTimeout: agent.DefaultEdgeSleepInterval, EdgeInsecurePoll: false, LogLevel: agent.DefaultLogLevel, @@ -97,15 +95,6 @@ func (parser *EnvOptionParser) Options() (*agent.Options, error) { options.EdgeKey = edgeKeyEnv } - edgePollIntervalEnv := os.Getenv(EnvKeyEdgePollFrequency) - if edgePollIntervalEnv != "" { - _, err := time.ParseDuration(edgePollIntervalEnv) - if err != nil { - return nil, errors.New("invalid time duration format in " + EnvKeyEdgePollFrequency + " environment variable") - } - options.EdgePollFrequency = edgePollIntervalEnv 
- } - edgeSleepIntervalEnv := os.Getenv(EnvKeyEdgeInactivityTimeout) if edgeSleepIntervalEnv != "" { _, err := time.ParseDuration(edgeSleepIntervalEnv) From 340ffa88867e2b34d94a297331b5d10afc00e0bf Mon Sep 17 00:00:00 2001 From: Anthony Lapenna Date: Tue, 17 Sep 2019 16:46:38 +1200 Subject: [PATCH 15/29] feat(docker): retrieve ClusterAddr via service name --- agent.go | 1 + cmd/agent/main.go | 41 +++++++++++++++++++---------------------- dev.sh | 1 - docker/docker.go | 21 +++++++++++++++++++++ 4 files changed, 41 insertions(+), 23 deletions(-) diff --git a/agent.go b/agent.go index dfd69198..ca6c6efa 100644 --- a/agent.go +++ b/agent.go @@ -98,6 +98,7 @@ type ( InfoService interface { GetInformationFromDockerEngine() (map[string]string, error) GetContainerIpFromDockerEngine(containerName string) (string, error) + GetServiceNameFromDockerEngine(containerName string) (string, error) } // TLSService is used to create TLS certificates to use enable HTTPS. diff --git a/cmd/agent/main.go b/cmd/agent/main.go index a92db152..fde63857 100644 --- a/cmd/agent/main.go +++ b/cmd/agent/main.go @@ -41,26 +41,37 @@ func main() { log.Println("[INFO] [main] [message: Agent running on a Swarm cluster node. 
Running in cluster mode]") } - if options.ClusterAddress == "" && clusterMode { - log.Fatalf("[ERROR] [main,configuration] [message: AGENT_CLUSTER_ADDR environment variable is required when deploying the agent inside a Swarm cluster]") + containerName, err := os.GetHostName() + if err != nil { + log.Fatalf("[ERROR] [main,os] [message: Unable to retrieve container name] [error: %s]", err) } - advertiseAddr, err := retrieveAdvertiseAddress(&infoService) + advertiseAddr, err := infoService.GetContainerIpFromDockerEngine(containerName) if err != nil { - log.Fatalf("[ERROR] [main,docker,os] [message: Unable to retrieve local agent IP address] [error: %s]", err) + log.Fatalf("[ERROR] [main,docker] [message: Unable to retrieve local agent IP address] [error: %s]", err) } var clusterService agent.ClusterService if clusterMode { clusterService = cluster.NewClusterService(agentTags) + clusterAddr := options.ClusterAddress + if clusterAddr == "" { + serviceName, err := infoService.GetServiceNameFromDockerEngine(containerName) + if err != nil { + log.Fatalf("[ERROR] [main,docker] [message: Unable to agent service name from Docker] [error: %s]", err) + } + + clusterAddr = fmt.Sprintf("tasks.%s", serviceName) + } + // TODO: Workaround. looks like the Docker DNS cannot find any info on tasks. // sometimes... Waiting a bit before starting the discovery (at least 3 seconds) seems to solve the problem. 
time.Sleep(3 * time.Second) - joinAddr, err := net.LookupIPAddresses(options.ClusterAddress) + joinAddr, err := net.LookupIPAddresses(clusterAddr) if err != nil { - log.Fatalf("[ERROR] [main,net] [host: %s] [message: Unable to retrieve a list of IP associated to the host] [error: %s]", options.ClusterAddress, err) + log.Fatalf("[ERROR] [main,net] [host: %s] [message: Unable to retrieve a list of IP associated to the host] [error: %s]", clusterAddr, err) } err = clusterService.Create(advertiseAddr, joinAddr) @@ -68,11 +79,11 @@ func main() { log.Fatalf("[ERROR] [main,cluster] [message: Unable to create cluster] [error: %s]", err) } + log.Printf("[DEBUG] [main,configuration] [agent_port: %s] [cluster_address: %s] [advertise_address: %s]", options.AgentServerPort, clusterAddr, advertiseAddr) + defer clusterService.Leave() } - log.Printf("[DEBUG] [main,configuration] [agent_port: %s] [cluster_address: %s] [advertise_address: %s]", options.AgentServerPort, options.ClusterAddress, advertiseAddr) - var tunnelOperator agent.TunnelOperator if options.EdgeMode { apiServerAddr := fmt.Sprintf("%s:%s", advertiseAddr, options.AgentServerPort) @@ -281,17 +292,3 @@ func retrieveInformationFromDockerEnvironment(infoService agent.InfoService) (ma return agentTags, nil } - -func retrieveAdvertiseAddress(infoService agent.InfoService) (string, error) { - containerName, err := os.GetHostName() - if err != nil { - return "", err - } - - advertiseAddr, err := infoService.GetContainerIpFromDockerEngine(containerName) - if err != nil { - return "", err - } - - return advertiseAddr, nil -} diff --git a/dev.sh b/dev.sh index a949dd55..66851249 100755 --- a/dev.sh +++ b/dev.sh @@ -91,7 +91,6 @@ function deploy_swarm() { -e CAP_HOST_MANAGEMENT=${CAP_HOST_MANAGEMENT} \ -e EDGE=${EDGE} \ -e EDGE_ID=${EDGE_ID} \ - -e AGENT_CLUSTER_ADDR=tasks.portainer-agent-dev \ --mode global \ --mount type=bind,src=//var/run/docker.sock,dst=/var/run/docker.sock \ --mount 
type=bind,src=//var/lib/docker/volumes,dst=/var/lib/docker/volumes \
diff --git a/docker/docker.go b/docker/docker.go
index 5d34034f..ab639a37 100644
--- a/docker/docker.go
+++ b/docker/docker.go
@@ -10,6 +10,10 @@ import (
 	"github.com/portainer/agent"
 )

+const (
+	serviceNameLabel = "com.docker.swarm.service.name"
+)
+
 // InfoService is a service used to retrieve information from a Docker environment.
 type InfoService struct{}

@@ -82,3 +86,20 @@ func (service *InfoService) GetContainerIpFromDockerEngine(containerName string)

 	return "", errors.New("unable to retrieve the address on which the agent can advertise. Check your network settings")
 }
+
+// GetServiceNameFromDockerEngine is used to return the name of the Swarm service the agent is part of.
+// The service name is retrieved through container labels.
+func (service *InfoService) GetServiceNameFromDockerEngine(containerName string) (string, error) {
+	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithVersion(agent.SupportedDockerAPIVersion))
+	if err != nil {
+		return "", err
+	}
+	defer cli.Close()
+
+	containerInspect, err := cli.ContainerInspect(context.Background(), containerName)
+	if err != nil {
+		return "", err
+	}
+
+	return containerInspect.Config.Labels[serviceNameLabel], nil
+}

From e2fdcc428de21e9ae2a854d39f6f4dcef36e6196 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Tue, 17 Sep 2019 17:40:37 +1200
Subject: [PATCH 16/29] feat(serf): update Serf configuration

---
 serf/cluster.go | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/serf/cluster.go b/serf/cluster.go
index 4b8ddf38..3421cc25 100644
--- a/serf/cluster.go
+++ b/serf/cluster.go
@@ -4,6 +4,7 @@ import (
 	"fmt"
 	"log"
 	"os"
+	"time"

 	"github.com/hashicorp/logutils"
 	"github.com/hashicorp/serf/serf"
@@ -46,6 +47,11 @@ func (service *ClusterService) Create(advertiseAddr string, joinAddr []string) e
 	conf.MemberlistConfig.LogOutput = filter
 	conf.LogOutput = filter
 	conf.MemberlistConfig.AdvertiseAddr = advertiseAddr
+
+	//
Override default Serf configuration with Swarm/overlay sane defaults
+	conf.ReconnectInterval = 10 * time.Second
+	conf.ReconnectTimeout = 1 * time.Minute
+
 	log.Printf("[DEBUG] [cluster,serf] [advertise_address: %s] [join_address: %s]", advertiseAddr, joinAddr)

 	cluster, err := serf.Create(conf)

From 3f6e055763e480486358a9a3c88d106ff0687166 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Thu, 19 Sep 2019 10:07:56 +1200
Subject: [PATCH 17/29] feat(http): log failed resourceList request

---
 http/proxy/cluster.go | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index ed7ecf4b..c30c52da 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -4,6 +4,7 @@ import (
 	"bytes"
 	"crypto/tls"
 	"io/ioutil"
+	"log"
 	"net/http"
 	"sync"
 	"time"
@@ -57,7 +58,8 @@ func (clusterProxy *ClusterProxy) ClusterOperation(request *http.Request, cluste

 	for result := range dataChannel {
 		if result.err != nil {
-			return nil, result.err
+			log.Printf("[WARN] [http,docker,cluster] [node: %s] [message: Unable to retrieve node resources for aggregation] [error: %s]", result.nodeName, result.err)
+			continue
 		}

 		for _, item := range result.responseContent {
@@ -89,20 +91,20 @@ func (clusterProxy *ClusterProxy) copyAndExecuteRequest(request *http.Request, m

 	requestCopy, err := copyRequest(request, member)
 	if err != nil {
-		ch <- agentRequestResult{err: err}
+		ch <- agentRequestResult{err: err, nodeName: member.NodeName}
 		return
 	}

 	response, err := clusterProxy.client.Do(requestCopy)
 	if err != nil {
-		ch <- agentRequestResult{err: err}
+		ch <- agentRequestResult{err: err, nodeName: member.NodeName}
 		return
 	}
 	defer response.Body.Close()

 	data, err := responseToJSONArray(response, request.URL.Path)
 	if err != nil {
-		ch <- agentRequestResult{err: err}
+		ch <- agentRequestResult{err: err, nodeName: member.NodeName}
 		return
 	}

From 40d1e06b89b13027e518e850cbaf979160d15a72 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Thu, 19
Sep 2019 10:12:53 +1200
Subject: [PATCH 18/29] refactor(http): use server Config.Secured to determine TLS usage

---
 http/handler/docker/docker_operation.go |  4 ++--
 http/handler/docker/handler.go          |  5 +++--
 http/handler/handler.go                 |  4 ++--
 http/proxy/agentproxy.go                |  6 ++++--
 http/proxy/cluster.go                   | 10 ++++++----
 http/proxy/proxy.go                     |  4 ++--
 6 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/http/handler/docker/docker_operation.go b/http/handler/docker/docker_operation.go
index 61164458..b7616723 100644
--- a/http/handler/docker/docker_operation.go
+++ b/http/handler/docker/docker_operation.go
@@ -65,7 +65,7 @@ func (handler *Handler) executeOperationOnManagerNode(rw http.ResponseWriter, re
 		if targetMember == nil {
 			return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent located on a manager node", errors.New("Unable to find an agent on any manager node")}
 		}
-		proxy.AgentHTTPRequest(rw, request, targetMember)
+		proxy.AgentHTTPRequest(rw, request, targetMember, handler.useTLS)
 	}
 	return nil
 }
@@ -81,7 +81,7 @@ func (handler *Handler) executeOperationOnNode(rw http.ResponseWriter, request *
 			return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent", errors.New("Unable to find the targeted agent")}
 		}

-		proxy.AgentHTTPRequest(rw, request, targetMember)
+		proxy.AgentHTTPRequest(rw, request, targetMember, handler.useTLS)
 	}
 	return nil
 }
diff --git a/http/handler/docker/handler.go b/http/handler/docker/handler.go
index df3d0dfa..b3ac75a7 100644
--- a/http/handler/docker/handler.go
+++ b/http/handler/docker/handler.go
@@ -15,15 +15,16 @@ type Handler struct {
 	clusterProxy   *proxy.ClusterProxy
 	clusterService agent.ClusterService
 	agentTags      map[string]string
+	useTLS         bool
 }

 // NewHandler returns a new instance of Handler.
 // It sets the associated handle functions for all the Docker related HTTP endpoints.
-func NewHandler(clusterService agent.ClusterService, agentTags map[string]string, notaryService *security.NotaryService) *Handler {
+func NewHandler(clusterService agent.ClusterService, agentTags map[string]string, notaryService *security.NotaryService, useTLS bool) *Handler {
 	h := &Handler{
 		Router:         mux.NewRouter(),
 		dockerProxy:    proxy.NewLocalProxy(),
-		clusterProxy:   proxy.NewClusterProxy(),
+		clusterProxy:   proxy.NewClusterProxy(useTLS),
 		clusterService: clusterService,
 		agentTags:      agentTags,
 	}
diff --git a/http/handler/handler.go b/http/handler/handler.go
index cbca4220..d3dcd9e4 100644
--- a/http/handler/handler.go
+++ b/http/handler/handler.go
@@ -52,14 +52,14 @@ var dockerAPIVersionRegexp = regexp.MustCompile(`(/v[0-9]\.[0-9]*)?`)

 // NewHandler returns a pointer to a Handler.
 func NewHandler(config *Config) *Handler {
-	agentProxy := proxy.NewAgentProxy(config.ClusterService, config.AgentTags)
+	agentProxy := proxy.NewAgentProxy(config.ClusterService, config.AgentTags, config.Secured)
 	notaryService := security.NewNotaryService(config.SignatureService, config.Secured)

 	return &Handler{
 		agentHandler:       httpagenthandler.NewHandler(config.ClusterService, notaryService),
 		browseHandler:      browse.NewHandler(agentProxy, notaryService, config.AgentOptions),
 		browseHandlerV1:    browse.NewHandlerV1(agentProxy, notaryService),
-		dockerProxyHandler: docker.NewHandler(config.ClusterService, config.AgentTags, notaryService),
+		dockerProxyHandler: docker.NewHandler(config.ClusterService, config.AgentTags, notaryService, config.Secured),
 		keyHandler:         key.NewHandler(config.TunnelOperator, config.ClusterService, notaryService, config.EdgeMode),
 		webSocketHandler:   websocket.NewHandler(config.ClusterService, config.AgentTags, notaryService),
 		hostHandler:        host.NewHandler(config.SystemService, agentProxy, notaryService),
diff --git a/http/proxy/agentproxy.go b/http/proxy/agentproxy.go
index 2abf96d1..d09dfcd5 100644
--- a/http/proxy/agentproxy.go
+++ b/http/proxy/agentproxy.go
@@ -12,13 +12,15
@@ import (
 type AgentProxy struct {
 	clusterService agent.ClusterService
 	agentTags      map[string]string
+	useTLS         bool
 }

 // NewAgentProxy returns a pointer to a new AgentProxy object
-func NewAgentProxy(clusterService agent.ClusterService, agentTags map[string]string) *AgentProxy {
+func NewAgentProxy(clusterService agent.ClusterService, agentTags map[string]string, useTLS bool) *AgentProxy {
 	return &AgentProxy{
 		clusterService: clusterService,
 		agentTags:      agentTags,
+		useTLS:         useTLS,
 	}
 }

@@ -40,7 +42,7 @@ func (p *AgentProxy) Redirect(next http.Handler) http.Handler {
 			if targetMember == nil {
 				return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent", errors.New("Unable to find the targeted agent")}
 			}
-			AgentHTTPRequest(rw, r, targetMember)
+			AgentHTTPRequest(rw, r, targetMember, p.useTLS)
 		}
 		return nil
 	})
diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index ed7ecf4b..7f0e1fae 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -16,11 +16,12 @@ const defaultClusterRequestTimeout = 120
 // ClusterProxy is a service used to execute the same requests on multiple targets.
 type ClusterProxy struct {
 	client *http.Client
+	useTLS bool
 }

 // NewClusterProxy returns a pointer to a ClusterProxy.
 // It also sets the default values used in the underlying http.Client.
-func NewClusterProxy() *ClusterProxy {
+func NewClusterProxy(useTLS bool) *ClusterProxy {
 	tlsConfig := &tls.Config{
 		InsecureSkipVerify: true,
 	}
@@ -32,6 +33,7 @@ func NewClusterProxy() *ClusterProxy {
 				TLSClientConfig: tlsConfig,
 			},
 		},
+		useTLS: useTLS,
 	}
 }

@@ -87,7 +89,7 @@ func (clusterProxy *ClusterProxy) executeRequestOnCluster(request *http.Request,
 func (clusterProxy *ClusterProxy) copyAndExecuteRequest(request *http.Request, member *agent.ClusterMember, ch chan agentRequestResult, wg *sync.WaitGroup) {
 	defer wg.Done()

-	requestCopy, err := copyRequest(request, member)
+	requestCopy, err := copyRequest(request, member, clusterProxy.useTLS)
 	if err != nil {
 		ch <- agentRequestResult{err: err}
 		return
@@ -109,7 +111,7 @@ func (clusterProxy *ClusterProxy) copyAndExecuteRequest(request *http.Request, m
 	ch <- agentRequestResult{err: nil, responseContent: data, nodeName: member.NodeName}
 }

-func copyRequest(request *http.Request, member *agent.ClusterMember) (*http.Request, error) {
+func copyRequest(request *http.Request, member *agent.ClusterMember, useTLS bool) (*http.Request, error) {
 	body, err := ioutil.ReadAll(request.Body)
 	if err != nil {
 		return nil, err
@@ -119,7 +121,7 @@ func copyRequest(request *http.Request, member *agent.ClusterMember) (*http.Requ
 	url.Host = member.IPAddress + ":" + member.Port
 	url.Scheme = "http"

-	if request.TLS != nil {
+	if useTLS {
 		url.Scheme = "https"
 	}

diff --git a/http/proxy/proxy.go b/http/proxy/proxy.go
index 039227ee..30d703c1 100644
--- a/http/proxy/proxy.go
+++ b/http/proxy/proxy.go
@@ -12,12 +12,12 @@ import (
 )

 // AgentHTTPRequest redirects a HTTP request to another agent.
-func AgentHTTPRequest(rw http.ResponseWriter, request *http.Request, target *agent.ClusterMember) {
+func AgentHTTPRequest(rw http.ResponseWriter, request *http.Request, target *agent.ClusterMember, useTLS bool) {
 	urlCopy := request.URL
 	urlCopy.Host = target.IPAddress + ":" + target.Port

 	urlCopy.Scheme = "http"
-	if request.TLS != nil {
+	if useTLS {
 		urlCopy.Scheme = "https"
 	}

From f837e83ff1bb22ccac4114e15f182509b4d52579 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Thu, 19 Sep 2019 17:15:53 +1200
Subject: [PATCH 19/29] feat(http): send short-lived ping request before resourceList request

---
 dev.sh                |  1 -
 http/proxy/cluster.go | 42 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/dev.sh b/dev.sh
index 66851249..caaf2c73 100755
--- a/dev.sh
+++ b/dev.sh
@@ -97,7 +97,6 @@ function deploy_swarm() {
   --mount type=bind,src=//,dst=/host \
   --publish target=9001,published=9001 \
   --publish mode=host,published=80,target=80 \
-  --restart-condition none \
   "${IMAGE_NAME}"

   # --mount type=volume,src=portainer_agent_data,dst=/data \
diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index ed7ecf4b..280efebd 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -3,6 +3,8 @@ package proxy
 import (
 	"bytes"
 	"crypto/tls"
+	"errors"
+	"fmt"
 	"io/ioutil"
 	"net/http"
 	"sync"
@@ -15,7 +17,8 @@ const defaultClusterRequestTimeout = 120

 // ClusterProxy is a service used to execute the same requests on multiple targets.
 type ClusterProxy struct {
-	client *http.Client
+	client     *http.Client
+	pingClient *http.Client
 }

 // NewClusterProxy returns a pointer to a ClusterProxy.
@@ -32,6 +35,12 @@ func NewClusterProxy() *ClusterProxy {
 				TLSClientConfig: tlsConfig,
 			},
 		},
+		pingClient: &http.Client{
+			Timeout: time.Second * 3,
+			Transport: &http.Transport{
+				TLSClientConfig: tlsConfig,
+			},
+		},
 	}
 }

@@ -84,9 +93,40 @@ func (clusterProxy *ClusterProxy) executeRequestOnCluster(request *http.Request,
 	wg.Wait()
 }

+func (clusterProxy *ClusterProxy) pingAgent(request *http.Request, member *agent.ClusterMember) error {
+	agentScheme := "http"
+	if request.TLS != nil {
+		agentScheme = "https"
+	}
+
+	agentURL := fmt.Sprintf("%s://%s:%s/ping", agentScheme, member.IPAddress, member.Port)
+
+	pingRequest, err := http.NewRequest(http.MethodGet, agentURL, nil)
+	if err != nil {
+		return err
+	}
+
+	response, err := clusterProxy.pingClient.Do(pingRequest)
+	if err != nil {
+		return err
+	}
+
+	if response.StatusCode != http.StatusNoContent {
+		return errors.New("agent ping request failed")
+	}
+
+	return nil
+}
+
 func (clusterProxy *ClusterProxy) copyAndExecuteRequest(request *http.Request, member *agent.ClusterMember, ch chan agentRequestResult, wg *sync.WaitGroup) {
 	defer wg.Done()

+	err := clusterProxy.pingAgent(request, member)
+	if err != nil {
+		ch <- agentRequestResult{err: err, nodeName: member.NodeName}
+		return
+	}
+
 	requestCopy, err := copyRequest(request, member)
 	if err != nil {
 		ch <- agentRequestResult{err: err}
 		return

From edcdfb3264373b340983ceaaa37bf123d9993960 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Thu, 19 Sep 2019 17:31:20 +1200
Subject: [PATCH 20/29] feat(http): add more logging around failed proxy requests

---
 dev.sh                                  | 1 -
 http/handler/docker/docker_operation.go | 3 +++
 http/proxy/agentproxy.go                | 2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/dev.sh b/dev.sh
index 66851249..caaf2c73 100755
--- a/dev.sh
+++ b/dev.sh
@@ -97,7 +97,6 @@ function deploy_swarm() {
   --mount type=bind,src=//,dst=/host \
   --publish target=9001,published=9001 \
   --publish mode=host,published=80,target=80 \
-  --restart-condition none \
   "${IMAGE_NAME}"

   # --mount type=volume,src=portainer_agent_data,dst=/data \
diff --git a/http/handler/docker/docker_operation.go b/http/handler/docker/docker_operation.go
index 61164458..f94cdfd7 100644
--- a/http/handler/docker/docker_operation.go
+++ b/http/handler/docker/docker_operation.go
@@ -2,6 +2,7 @@ package docker

 import (
 	"errors"
+	"log"
 	"net/http"
 	"strings"

@@ -63,6 +64,7 @@ func (handler *Handler) executeOperationOnManagerNode(rw http.ResponseWriter, re
 	} else {
 		targetMember := handler.clusterService.GetMemberByRole(agent.NodeRoleManager)
 		if targetMember == nil {
+			log.Printf("[ERROR] [http,docker,proxy] [request: %s] [message: unable to redirect request to a manager node: no manager node found]", request.URL)
 			return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent located on a manager node", errors.New("Unable to find an agent on any manager node")}
 		}
 		proxy.AgentHTTPRequest(rw, request, targetMember)
@@ -78,6 +80,7 @@ func (handler *Handler) executeOperationOnNode(rw http.ResponseWriter, request *
 	} else {
 		targetMember := handler.clusterService.GetMemberByNodeName(agentTargetHeader)
 		if targetMember == nil {
+			log.Printf("[ERROR] [http,docker,proxy] [target_node: %s] [request: %s] [message: unable to redirect request to specified node: agent not found in cluster]", agentTargetHeader, r.URL)
 			return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent", errors.New("Unable to find the targeted agent")}
 		}

diff --git a/http/proxy/agentproxy.go b/http/proxy/agentproxy.go
index 2abf96d1..4abd5af8 100644
--- a/http/proxy/agentproxy.go
+++ b/http/proxy/agentproxy.go
@@ -2,6 +2,7 @@ package proxy

 import (
 	"errors"
+	"log"
 	"net/http"

 	"github.com/portainer/agent"
@@ -38,6 +39,7 @@ func (p *AgentProxy) Redirect(next http.Handler) http.Handler {
 		} else {
 			targetMember := p.clusterService.GetMemberByNodeName(agentTargetHeader)
 			if targetMember == nil {
+
log.Printf("[ERROR] [http,agent,proxy] [target_node: %s] [request: %s] [message: unable to redirect request to specified node: agent not found in cluster]", agentTargetHeader, r.URL)
 				return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent", errors.New("Unable to find the targeted agent")}
 			}
 			AgentHTTPRequest(rw, r, targetMember)

From 085f42cd02e74fffe84ef72b309fb073ad3f377c Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 08:52:49 +1200
Subject: [PATCH 21/29] feat(http): disable keep alive for resource aggregation client

---
 http/proxy/cluster.go | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index ed7ecf4b..db33fd2e 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -29,7 +29,8 @@ func NewClusterProxy() *ClusterProxy {
 		client: &http.Client{
 			Timeout: time.Second * defaultClusterRequestTimeout,
 			Transport: &http.Transport{
-				TLSClientConfig: tlsConfig,
+				TLSClientConfig:   tlsConfig,
+				DisableKeepAlives: true,
 			},
 		},
 	}

From 5f5c38741bf2765d514e6b65df690defa7ddca6c Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 08:57:40 +1200
Subject: [PATCH 22/29] feat(http): disable keep alive for short ping client

---
 http/proxy/cluster.go | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index 280efebd..5d1dfefe 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -38,7 +38,8 @@ func NewClusterProxy() *ClusterProxy {
 		pingClient: &http.Client{
 			Timeout: time.Second * 3,
 			Transport: &http.Transport{
-				TLSClientConfig: tlsConfig,
+				TLSClientConfig:   tlsConfig,
+				DisableKeepAlives: true,
 			},
 		},
 	}

From 049f0868ff24eaca93d6e881a511acc0804e072f Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 09:33:21 +1200
Subject: [PATCH 23/29] feat(http): specify Read/Write timeouts for Server

---
 http/server.go | 21
+++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/http/server.go b/http/server.go
index 579aa05c..dcaada81 100644
--- a/http/server.go
+++ b/http/server.go
@@ -3,6 +3,7 @@ package http
 import (
 	"log"
 	"net/http"
+	"time"

 	"github.com/portainer/agent"
 	"github.com/portainer/agent/http/handler"
@@ -66,7 +67,15 @@ func (server *APIServer) StartUnsecured() error {

 	listenAddr := server.addr + ":" + server.port
 	log.Printf("[INFO] [http] [server_addr: %s] [server_port: %s] [secured: %t] [api_version: %s] [message: Starting Agent API server]", server.addr, server.port, config.Secured, agent.Version)
-	return http.ListenAndServe(listenAddr, h)
+
+	httpServer := &http.Server{
+		Addr:         listenAddr,
+		Handler:      h,
+		ReadTimeout:  5 * time.Second,
+		WriteTimeout: 120 * time.Second,
+	}
+
+	return httpServer.ListenAndServe()
 }

 // Start starts a new web server by listening on the specified listenAddr.
@@ -86,5 +95,13 @@ func (server *APIServer) StartSecured() error {

 	listenAddr := server.addr + ":" + server.port
 	log.Printf("[INFO] [http] [server_addr: %s] [server_port: %s] [secured: %t] [api_version: %s] [message: Starting Agent API server]", server.addr, server.port, config.Secured, agent.Version)
-	return http.ListenAndServeTLS(listenAddr, agent.TLSCertPath, agent.TLSKeyPath, h)
+
+	httpServer := &http.Server{
+		Addr:         listenAddr,
+		Handler:      h,
+		ReadTimeout:  5 * time.Second,
+		WriteTimeout: 120 * time.Second,
+	}
+
+	return httpServer.ListenAndServeTLS(agent.TLSCertPath, agent.TLSKeyPath)
 }

From b3251265ebed93dc7e5b0e4c1513518cf34089bc Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 16:10:08 +1200
Subject: [PATCH 24/29] fix(http): fix method after invalid conflict merge

---
 http/proxy/cluster.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/http/proxy/cluster.go b/http/proxy/cluster.go
index 84579f2e..d197d219 100644
--- a/http/proxy/cluster.go
+++ b/http/proxy/cluster.go
@@ -133,7 +133,7 @@ func (clusterProxy
*ClusterProxy) copyAndExecuteRequest(request *http.Request, m
 		return
 	}

-	requestCopy, err := copyRequest(request, member)
+	requestCopy, err := copyRequest(request, member, clusterProxy.useTLS)
 	if err != nil {
 		ch <- agentRequestResult{err: err, nodeName: member.NodeName}
 		return

From 171cda009dd15c0a9342f503d85f88352a01461d Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 16:20:09 +1200
Subject: [PATCH 25/29] refactor(http): fix invalid variable name

---
 http/handler/docker/docker_operation.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/http/handler/docker/docker_operation.go b/http/handler/docker/docker_operation.go
index ebd1b332..28faaa80 100644
--- a/http/handler/docker/docker_operation.go
+++ b/http/handler/docker/docker_operation.go
@@ -80,7 +80,7 @@ func (handler *Handler) executeOperationOnNode(rw http.ResponseWriter, request *
 	} else {
 		targetMember := handler.clusterService.GetMemberByNodeName(agentTargetHeader)
 		if targetMember == nil {
-			log.Printf("[ERROR] [http,docker,proxy] [target_node: %s] [request: %s] [message: unable to redirect request to specified node: agent not found in cluster]", agentTargetHeader, r.URL)
+			log.Printf("[ERROR] [http,docker,proxy] [target_node: %s] [request: %s] [message: unable to redirect request to specified node: agent not found in cluster]", agentTargetHeader, request.URL)
 			return &httperror.HandlerError{http.StatusInternalServerError, "The agent was unable to contact any other agent", errors.New("Unable to find the targeted agent")}
 		}

From 5d9e684432f87abff5db0a2b9708535581b7c082 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 20 Sep 2019 16:53:39 +1200
Subject: [PATCH 26/29] fix(http): fix missing variable init

---
 http/handler/docker/handler.go | 1 +
 1 file changed, 1 insertion(+)

diff --git a/http/handler/docker/handler.go b/http/handler/docker/handler.go
index b3ac75a7..683f07d9 100644
--- a/http/handler/docker/handler.go
+++ b/http/handler/docker/handler.go
@@ -27,6
+27,7 @@ func NewHandler(clusterService agent.ClusterService, agentTags map[string]string
 		clusterProxy:   proxy.NewClusterProxy(useTLS),
 		clusterService: clusterService,
 		agentTags:      agentTags,
+		useTLS:         useTLS,
 	}

 	h.PathPrefix("/").Handler(notaryService.DigitalSignatureVerification(httperror.LoggerHandler(h.dockerOperation)))

From 65e22f304be719c349b80630d2c330085ec287db Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Tue, 1 Oct 2019 06:42:29 +1300
Subject: [PATCH 27/29] fix(http): prevent '..' in filename for browsePut operation

---
 filesystem/filesystem.go             | 6 +++---
 http/handler/browse/browse_delete.go | 2 +-
 http/handler/browse/browse_get.go    | 2 +-
 http/handler/browse/browse_list.go   | 2 +-
 http/handler/browse/browse_put.go    | 7 ++++++-
 http/handler/browse/browse_rename.go | 4 ++--
 6 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/filesystem/filesystem.go b/filesystem/filesystem.go
index aa70e18a..24a4b1ec 100644
--- a/filesystem/filesystem.go
+++ b/filesystem/filesystem.go
@@ -101,13 +101,13 @@ func RenameFile(oldPath, newPath string) error {

 // WriteFile takes a path, filename, a file and the mode that should be associated
 // to the file and writes it to disk
-func WriteFile(uploadedFilePath, filename string, file []byte, mode uint32) error {
-	err := os.MkdirAll(uploadedFilePath, 0755)
+func WriteFile(folder, filename string, file []byte, mode uint32) error {
+	err := os.MkdirAll(folder, 0755)
 	if err != nil {
 		return err
 	}

-	filePath := path.Join(uploadedFilePath, filename)
+	filePath := path.Join(folder, filename)

 	err = ioutil.WriteFile(filePath, file, os.FileMode(mode))
 	if err != nil {
diff --git a/http/handler/browse/browse_delete.go b/http/handler/browse/browse_delete.go
index 2ddec42f..dfd5d526 100644
--- a/http/handler/browse/browse_delete.go
+++ b/http/handler/browse/browse_delete.go
@@ -10,7 +10,7 @@ import (
 	"github.com/portainer/libhttp/response"
 )

-// DELETE request on /browse/delete?id=:id&path=:path
+// DELETE request on
/browse/delete?volumeID=:id&path=:path
 func (handler *Handler) browseDelete(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, _ := request.RetrieveQueryParameter(r, "volumeID", true)
 	if volumeID == "" && !handler.agentOptions.HostManagementEnabled {
diff --git a/http/handler/browse/browse_get.go b/http/handler/browse/browse_get.go
index aa04a437..7c2c0745 100644
--- a/http/handler/browse/browse_get.go
+++ b/http/handler/browse/browse_get.go
@@ -9,7 +9,7 @@ import (
 	"github.com/portainer/libhttp/request"
 )

-// GET request on /browse/get?id=:id&path=:path
+// GET request on /browse/get?volumeID=:id&path=:path
 func (handler *Handler) browseGet(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, _ := request.RetrieveQueryParameter(r, "volumeID", true)
 	if volumeID == "" && !handler.agentOptions.HostManagementEnabled {
diff --git a/http/handler/browse/browse_list.go b/http/handler/browse/browse_list.go
index 65bdfe8d..60ac9169 100644
--- a/http/handler/browse/browse_list.go
+++ b/http/handler/browse/browse_list.go
@@ -10,7 +10,7 @@ import (
 	"github.com/portainer/libhttp/response"
 )

-// GET request on /browse/ls?id=:id&path=:path
+// GET request on /browse/ls?volumeID=:id&path=:path
 func (handler *Handler) browseList(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, _ := request.RetrieveQueryParameter(r, "volumeID", true)
 	if volumeID == "" && !handler.agentOptions.HostManagementEnabled {
diff --git a/http/handler/browse/browse_put.go b/http/handler/browse/browse_put.go
index 38df75db..0480225f 100644
--- a/http/handler/browse/browse_put.go
+++ b/http/handler/browse/browse_put.go
@@ -33,7 +33,7 @@ func (payload *browsePutPayload) Validate(r *http.Request) error {
 	return nil
 }

-// POST request on /browse/put?id=:id
+// POST request on /browse/put?volumeID=:id
 func (handler *Handler) browsePut(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, _ :=
request.RetrieveQueryParameter(r, "volumeID", true)
 	if volumeID == "" && !handler.agentOptions.HostManagementEnabled {
@@ -51,6 +51,11 @@ func (handler *Handler) browsePut(rw http.ResponseWriter, r *http.Request) *http
 		if err != nil {
 			return &httperror.HandlerError{http.StatusBadRequest, "Invalid volume", err}
 		}
+
+		_, err = filesystem.BuildPathToFileInsideVolume(volumeID, payload.Filename)
+		if err != nil {
+			return &httperror.HandlerError{http.StatusBadRequest, "Invalid filename", err}
+		}
 	}

 	err = filesystem.WriteFile(payload.Path, payload.Filename, payload.File, 0755)
diff --git a/http/handler/browse/browse_rename.go b/http/handler/browse/browse_rename.go
index f5a6a01d..2e223415 100644
--- a/http/handler/browse/browse_rename.go
+++ b/http/handler/browse/browse_rename.go
@@ -26,7 +26,7 @@ func (payload *browseRenamePayload) Validate(r *http.Request) error {
 	return nil
 }

-// PUT request on /browse/rename?id=:id
+// PUT request on /browse/rename?volumeID=:id
 func (handler *Handler) browseRename(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, _ := request.RetrieveQueryParameter(r, "volumeID", true)
 	if volumeID == "" && !handler.agentOptions.HostManagementEnabled {
@@ -58,7 +58,7 @@ func (handler *Handler) browseRename(rw http.ResponseWriter, r *http.Request) *h
 	return response.Empty(rw)
 }

-// PUT request on /v1/browse/rename?id=:id
+// PUT request on /v1/browse/:id/rename
 func (handler *Handler) browseRenameV1(rw http.ResponseWriter, r *http.Request) *httperror.HandlerError {
 	volumeID, err := request.RetrieveRouteVariableValue(r, "id")
 	if err != nil {

From 76d820b6ec4d626fcb022ba0244c92802669ade0 Mon Sep 17 00:00:00 2001
From: Itsconquest
Date: Thu, 3 Oct 2019 13:02:30 +1300
Subject: [PATCH 28/29] docs(project): add security info

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 658fc634..5cb94c03 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,10 @@ Containers, networks, volumes and
images are node specific resources, not cluste

 The purpose of the agent aims to allows previously node specific resources to be cluster-aware, all while keeping the Docker API request format. As aforementioned, this means that you only need to execute one Docker API request to retrieve all these resources from every node inside the cluster. In all bringing a better Docker user experience when managing Swarm clusters.

+## Security
+
+Here at Portainer, we believe in [responsible disclosure](https://en.wikipedia.org/wiki/Responsible_disclosure) of security issues. If you have found a security issue, please report it to .
+
 ## Technical details

 The Portainer agent is basically a cluster of Docker API proxies. Deployed inside a Swarm cluster on each node, it allows the

From 4388f8357112dabb04dc04b9020f63a45a333291 Mon Sep 17 00:00:00 2001
From: Anthony Lapenna
Date: Fri, 11 Oct 2019 11:00:32 +1300
Subject: [PATCH 29/29] chore(version): bump version number

---
 agent.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/agent.go b/agent.go
index 5808fe7f..7f4e7dda 100644
--- a/agent.go
+++ b/agent.go
@@ -138,7 +138,7 @@ type (

 const (
 	// Version represents the version of the agent.
-	Version = "1.4.0"
+	Version = "1.5.0"
 	// APIVersion represents the version of the agent's API.
 	APIVersion = "2"
 	// DefaultAgentAddr is the default address used by the Agent API server.
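
Note on PATCH 27/29: the new browsePut guard calls `filesystem.BuildPathToFileInsideVolume`, whose body is not part of this series. The block below is only a minimal sketch of how such a containment check could work, not the agent's actual implementation. The helper name `buildPathInsideVolume`, the error value `errOutsideVolume`, and the volume path used in `main` are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errOutsideVolume is a hypothetical error value used by this sketch only.
var errOutsideVolume = errors.New("path escapes the volume directory")

// buildPathInsideVolume is a hypothetical stand-in for the agent's
// filesystem.BuildPathToFileInsideVolume. It joins name onto volumeRoot and
// rejects any result that resolves outside volumeRoot (for example via "..").
func buildPathInsideVolume(volumeRoot, name string) (string, error) {
	root := filepath.Clean(volumeRoot)
	full := filepath.Join(root, name) // Join also cleans "." and ".." segments

	// The joined path must be the root itself or live strictly below it.
	if full != root && !strings.HasPrefix(full, root+string(filepath.Separator)) {
		return "", errOutsideVolume
	}
	return full, nil
}

func main() {
	root := "/var/lib/docker/volumes/data/_data" // assumed layout, for illustration only
	for _, name := range []string{"notes.txt", "sub/notes.txt", "../../etc/passwd"} {
		p, err := buildPathInsideVolume(root, name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```

Joining first and then comparing against the cleaned root plus a trailing separator rejects both `..` sequences and sibling directories that merely share a name prefix with the volume root.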