Bug 2948 - implement "copy-data" sftp extension
Summary: implement "copy-data" sftp extension
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sftp (show other bugs)
Version: -current
Hardware: All All
: P5 enhancement
Assignee: Assigned to nobody
URL: https://tools.ietf.org/html/draft-iet...
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-29 09:33 AEDT by Mike Frysinger
Modified: 2023-03-17 13:41 AEDT (History)
1 user (show)

See Also:


Attachments
sftp server copy-data extension (5.93 KB, patch)
2019-12-23 09:49 AEDT, Darren Tucker
no flags Details | Diff
sftp client copy-data extension (7.48 KB, patch)
2019-12-23 11:04 AEDT, Darren Tucker
no flags Details | Diff
sftp server copy-data extension v2 (6.86 KB, patch)
2021-04-05 00:57 AEST, Mike Frysinger
no flags Details | Diff
sftp client copy-data extension v2 (8.23 KB, patch)
2021-04-05 00:58 AEST, Mike Frysinger
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Frysinger 2018-12-29 09:33:58 AEDT
i posted some patches [1] to implement the "copy-data" sftp extension as defined in the draft IETF spec [2].  is there interest or explicit disinterest in adding support for extensions like this ?

[1] https://lists.mindrot.org/pipermail/openssh-unix-dev/2018-November/037351.html
[2] https://tools.ietf.org/html/draft-ietf-secsh-filexfer-extensions-00#section-7
Comment 1 Mike Frysinger 2019-12-23 05:50:20 AEDT
is there anything i can do to help move this forward ?
Comment 2 Darren Tucker 2019-12-23 09:49:50 AEDT
Created attachment 3344 [details]
sftp server copy-data extension

Attach patch for commenting.
Comment 3 Darren Tucker 2019-12-23 10:53:33 AEDT
Comment on attachment 3344 [details]
sftp server copy-data extension

> [...] This avoids needing to
>+download the data before uploading it across the network twice.

Why uploading twice?

[...]
>+	{ "copy-data", "copy-data", 0, process_extended_copy_data, 1 },

This is a protocol violation.  The server is implementing secsh-filexfer-02 which does not specify copy-data.  Vendor extensions such as this need an @[domainname] suffix.

[...]
>+				ret = read(read_fd, buf, len);

ret should be ssize_t not int.

That said I don't think we'd consider this without a client implementation since without that we can't test it.
Comment 4 Darren Tucker 2019-12-23 11:02:01 AEDT
(In reply to Darren Tucker from comment #3)
[..]
> That said I don't think we'd consider this without a client
> implementation since without that we can't test it.

ignore that comment, I see the followup patch.
Comment 5 Darren Tucker 2019-12-23 11:04:16 AEDT
Created attachment 3345 [details]
sftp client copy-data extension
Comment 6 Darren Tucker 2019-12-23 11:25:13 AEDT
Comment on attachment 3345 [details]
sftp client copy-data extension

[...]
>+.It Ic copy Ar oldpath Ar newpath

the man page says "copy" but the code says "cp".  Personally I prefer "cp".

[...]
> +	} else {
>+		mode = 0666;

I'm not sure we should be making world writable files by default.

>+	if (old_handle == NULL) {
>+		sshbuf_free(msg);
>+		return -1;
>+	}
>+
>+	/* Open the new file for writing */
[...]
>+	if (new_handle == NULL) {
>+		sshbuf_free(msg);
>+		return -1;
+	}

I think this leaks old_handle if opening new_handle fails.  It'd probably be cleaner to initialise status and the handles to -1/NULL then do a "goto out" and have all the cleanup in one place.
Comment 7 Darren Tucker 2019-12-23 12:39:47 AEDT
(In reply to Darren Tucker from comment #3)
[...]
> >+	{ "copy-data", "copy-data", 0, process_extended_copy_data, 1 },
> 
> This is a protocol violation.  The server is implementing
> secsh-filexfer-02 which does not specify copy-data.  Vendor
> extensions such as this need an @[domainname] suffix.

Rereading the draft I think I was wrong about this.
Comment 8 Mike Frysinger 2020-01-01 22:10:46 AEDT
(In reply to Darren Tucker from comment #3)

thx for the reviews.

> > [...] This avoids needing to
> >+download the data before uploading it across the network twice.
> 
> Why uploading twice?

i was referring to network transfers rather than upload specifically.  i'll rephrase to be more clear.

> >+	{ "copy-data", "copy-data", 0, process_extended_copy_data, 1 },
> 
> This is a protocol violation.  The server is implementing
> secsh-filexfer-02 which does not specify copy-data.  Vendor
> extensions such as this need an @[domainname] suffix.

the secsh-filexfer-02 spec [1] says:
> The name should be of the form "name@domain", where the domain is the DNS domain name of the organization defining the extension.
> Additional names that are not of this format may be defined later by the IETF.

since the IETF has an RFC that defines "copy-data" [2], and that's what i implemented, that's why i think it's appropriate to omit the @ vendor suffix.

[1] https://tools.ietf.org/html/draft-ietf-secsh-filexfer-02#section-4
[2] https://tools.ietf.org/html/draft-ietf-secsh-filexfer-extensions-00#section-7

> >+				ret = read(read_fd, buf, len);
> 
> ret should be ssize_t not int.

i'm aware of the API nuance, but i went with "follow existing style".  specifically, process_read() uses int with read(), and i mirrored that.  if that func is an oversight and should also be ssize_t, i'm happy to change.

(In reply to Darren Tucker from comment #6)
> >+.It Ic copy Ar oldpath Ar newpath
> 
> the man page says "copy" but the code says "cp".  Personally I
> prefer "cp".

nice catch.  sftp uses aliases for other commands, so i'll keep "copy" as the documented one, and keep "cp" as a shorter alias.  this keeps with existing "chdir" and "cd" behavior.

> > +	} else {
> >+		mode = 0666;
> 
> I'm not sure we should be making world writable files by default.

it isn't by default exactly ... that mode is used if the server doesn't respond with mode info for the source file which should be fairly unusual.

ignoring that, the server umask should apply.  so if the user's umask is like 0022, i'd expect this to be 0644.

that said, this is also a case of me copying existing style -- this is how do_download() behaves.

i can switch it to 0644 if people really prefer, but it seemed like all things considered, 0666 is correct.

> >+	if (old_handle == NULL) {
> >+		sshbuf_free(msg);
> >+		return -1;
> >+	}
> >+
> >+	/* Open the new file for writing */
> [...]
> >+	if (new_handle == NULL) {
> >+		sshbuf_free(msg);
> >+		return -1;
> +	}
> 
> I think this leaks old_handle if opening new_handle fails.  It'd
> probably be cleaner to initialise status and the handles to -1/NULL
> then do a "goto out" and have all the cleanup in one place.

i'll add an explicit free.  i didn't use goto style since this file doesn't do that for error handling in general.
Comment 9 Damien Miller 2020-08-28 13:43:36 AEST
Comment on attachment 3344 [details]
sftp server copy-data extension

looks good - some minor comments

>diff --git a/PROTOCOL b/PROTOCOL
>index f75c1c0ae5b0..04a392db33be 100644
...
>+static void
>+process_extended_copy_data(u_int32_t id)
...
>+	/* Disallow reading & writing to the same handle */
>+	if (read_handle == write_handle || read_fd < 0 || write_fd < 0) {

I think this should also check that both the read and write handles do not refer to the same path? (use handle_to_name())

>+		status = SSH2_FX_FAILURE;
>+	} else {

nit: prefer "goto out" over nesting if/else

>+		if (lseek(read_fd, read_off, SEEK_SET) < 0) {
>+			status = errno_to_portable(errno);
>+			error("process_extended_copy_data: read_seek failed");

nit: ditto

>+		} else if (!(handle_to_flags(write_handle) & O_APPEND) &&
>+				lseek(write_fd, write_off, SEEK_SET) < 0) {
>+			status = errno_to_portable(errno);
>+			error("process_extended_copy_data: write_seek failed");

nit: prefer __func__ to manual inclusion of function name

>+		} else {
>+			/* Process the request in chunks. */
>+			while (read_len || copy_until_eof) {

nit: prefer explicit comparison against zero (i.e "read_len > 0")

>+
>+				ret = read(read_fd, buf, len);
...
>+				ret = write(write_fd, buf, len);

I think this should use atomicio here to be signal safe.

>+				if ((size_t)ret != len) {
>+					debug2("nothing at all written");
>+					status = SSH2_FX_FAILURE;
>+					break;
>+				}

this block can go away with atomicio
Comment 10 Damien Miller 2020-08-28 13:47:23 AEST
Comment on attachment 3345 [details]
sftp client copy-data extension

This too looks good, minor comments:

>diff --git a/sftp-client.c b/sftp-client.c
>index 4986d6d8d291..cd2844a8585e 100644
>--- a/sftp-client.c
>+++ b/sftp-client.c
...
>+int
>+do_copy(struct sftp_conn *conn, const char *oldpath, const char *newpath)
>+{
...
>+	/* Silently return if the extension is not supported */
>+	if ((conn->exts & SFTP_EXT_COPY_DATA) == 0) {
>+		error("Server does not support copy-data extension");

This is not silent :)

>diff --git a/sftp.1 b/sftp.1
>index 0fd54cae090e..f2eae7f32790 100644
>--- a/sftp.1
>+++ b/sftp.1
...
>+.Ic lchdir , copy , chmod , chown ,

the manpage says the command is "copy", but ...

>diff --git a/sftp.c b/sftp.c
>index 7db86c2d3cf0..3288279172a9 100644
>--- a/sftp.c
>+++ b/sftp.c
...
>+	{ "cp",		I_COPY,		REMOTE	},

... it's implemented as "cp"

Either/both is fine, but it needs to be consistent of course.
Comment 11 Damien Miller 2020-08-28 13:49:19 AEST
Comment on attachment 3344 [details]
sftp server copy-data extension

>+	/* Disallow reading & writing to the same handle */
>+	if (read_handle == write_handle || read_fd < 0 || write_fd < 0) {

Maybe mention here that this also ensures that both handles refer to files rather than directories
Comment 12 Mike Frysinger 2020-11-30 17:03:44 AEDT
thanks for the feedback.  i'll take another pass at this once the limits extension lands since they're touching a lot of the same code and i don't feel like fighting rebase conflicts while review is ongoing.
Comment 13 Mike Frysinger 2021-04-05 00:57:55 AEST
Created attachment 3494 [details]
sftp server copy-data extension v2

this should address all the feedback provided
Comment 14 Mike Frysinger 2021-04-05 00:58:28 AEST
Created attachment 3495 [details]
sftp client copy-data extension v2

this should address all the feedback
Comment 15 Mike Frysinger 2021-09-25 12:54:33 AEST
had to rebase after expand-path@openssh.com was added.  i posted the patches to the mailing list this time:
https://lists.mindrot.org/pipermail/openssh-unix-dev/2021-September/039670.html
Comment 16 Mike Frysinger 2022-03-31 17:05:47 AEDT
it's been merged now for OpenSSH-9.0.  thanks all, very exciting!
Comment 17 Damien Miller 2022-10-04 21:59:26 AEDT
Closing bugs from openssh-9.1 release cycle
Comment 18 Damien Miller 2023-03-17 13:41:11 AEDT
OpenSSH 9.3 has been released. Close resolved bugs