api design

Ranter

lorentz

15282

Comments

0

SuspiciousBug

1366

3y

Use custom methods (either GET or POST):

- POST /data:lock

- POST /data:rename

- ...

(see https://cloud.google.com/apis/...)
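
A minimal sketch of how a server might split such a custom-method URL into resource and verb. The function name `split_custom_method` is hypothetical, and treating only the last colon of the last path segment as the separator is an assumption, not something the linked guide is quoted on here.

```python
from typing import Optional, Tuple


def split_custom_method(path: str) -> Tuple[str, Optional[str]]:
    """Split '/data:lock' into ('/data', 'lock').

    Only the last path segment is inspected, so a colon in an
    earlier segment is left alone.
    """
    head, slash, last = path.rpartition("/")
    resource, colon, verb = last.rpartition(":")
    if not colon or not verb:
        # No custom method present: plain resource access.
        return path, None
    return head + slash + resource, verb
```

A router could then dispatch on the pair `(HTTP method, verb)`, e.g. `("POST", "lock")`.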
0

AvatarOfKaine

3758

3y

YOU know this pisses me off
I recognize this and know I understood better wtf you were talking about
Which means I was more active.

"Encode to specific node metadata"

What do you mean by this ?

Encryption ? Lol
Or character set ?
Or representing the request to the file Date on the node ?
Also.. did I read that you're exposing a file system via a request path ?
0

AvatarOfKaine

3758

3y

And what do you mean by minimal constraints? Exposure of the file system ? As in what parts you can access or as in file operations and user access ?
0

stop

6594

3y

The methods introduced with WebDAV have some advantages:
1. No confusing paths
2. Publicly known methods: devs who use them can find a reference for what they mean
3. Future compatibility: if you decide to implement a subset of WebDAV, you only have to change the Content-Type, or add a wrapper that formats the output based on the requested Content-Type.
0

hitko

3005

3y

"I have a bunch of nails and a hammer. I also need some screws to combine with those nails, so I wonder how I go about adding a screwdriver to my hammer?" You don't, that's how.
1

Oktokolo

11330

3y

Make metadata part of the tree. The most naive variant would have /dir1/children/dir2/children/file/acl pointing at the ACL of /dir1/dir2/file. If you forbid another character in names, you can condense that into /dir1/dir2/file:acl:group:read pointing at the read flag for group in the ACL and the content being represented by /dir1/dir2/file:content or just /dir1/dir2/file.

The major benefit of path-based distinction between meta data and primary content is that the identification of what is actually accessed is right in your face making it easy to spot when accessing the wrong thing.
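
A rough sketch of the condensed variant, assuming ':' is the character forbidden in names. `Target` and `parse_target` are hypothetical names, and treating a missing selector the same as `:content` is how I read the description above.

```python
from typing import List, NamedTuple


class Target(NamedTuple):
    file_path: str       # e.g. '/dir1/dir2/file'
    selector: List[str]  # e.g. ['acl', 'group', 'read']; empty means content

    @property
    def is_content(self) -> bool:
        return self.selector in ([], ["content"])


def parse_target(path: str) -> Target:
    # ':' never occurs in names, so the first colon ends the file path
    # and everything after it addresses metadata.
    file_path, *selector = path.split(":")
    return Target(file_path, selector)
```

Because the selector is part of the path, a log line or a debugger shows at a glance whether a request touches the content or the ACL.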
1

lorentz

15282

3y

@Oktokolo This looks very clean, I'm concerned about injection vulnerabilities though. If at any point something has access to an ACL and accepts a path without filtering it for forbidden characters, it may be abused to modify the ACL.
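
To make the concern concrete, here is a sketch of a hypothetical third-party service that builds a request URL from a user-supplied file name. The host `store.example`, the function names, and the choice to reject rather than escape are all assumptions for illustration.

```python
def naive_url(base: str, user_dir: str, filename: str) -> str:
    # Plain concatenation: whatever the user typed ends up in the path.
    return f"{base}/{user_dir}/{filename}"


def checked_url(base: str, user_dir: str, filename: str) -> str:
    # The caller has to know that ':' (and '/') carry meaning for the
    # file store and reject them before forwarding.
    if ":" in filename or "/" in filename:
        raise ValueError(f"forbidden character in file name: {filename!r}")
    return f"{base}/{user_dir}/{filename}"
```

With the naive version, a file name like `notes.txt:acl:group:read` silently turns a content request into an ACL request.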
0

lorentz

15282

3y

The first option with the children prefix is safe but pretty ugly.
0

lorentz

15282

3y

@horus I suspect headers would become a bottleneck: users may well want to store and retrieve lots of tiny files, where e.g. including the ACL in every request would increase the transmitted data tenfold.
0

lorentz

15282

3y

@stop I don't want to support WebDAV because it is built entirely on a version of the XML namespace standard that is no longer in use. Supporting any subset of WebDAV would mean that my brand new API requires setting your XML parser in legacy mode from day 1. I have other complaints about WebDAV's property system too, but this is the main one that forces me to break compat altogether.
0

lorentz

15282

3y

It's worth considering designing the protocol in such a way that it doesn't collide with WebDAV so that an implementation can support both on the same URL, but this is not a priority since a sibling path can always be used for WebDAV access.
1

netikras

34605

3y

You cannot use /meta at the end of the URL: there may be a file with that name. Restricting/reserving characters is also a poor option, as this creates usage limitations.

Protocols tend to specify metadata at the beginning of the frame, so in your case that would be the beginning of the URL: /meta/stat/path/to/file or /meta/acl/path/to/file or /data/path/to/file. This way you can get/post a file and its metadata separately. It creates no limitations and IMO is a clean approach.

Another solution could be headers: with a header you could specify that you are aiming for metadata rather than data. To me that would be the second-best option, for when prefixing the path with the metadata kind is not possible.
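
A small sketch of the prefix variant. The `route` function and the `META_KINDS` set are hypothetical; only the /meta/stat, /meta/acl and /data prefixes come from the comment above.

```python
from typing import Optional, Tuple

META_KINDS = {"stat", "acl"}  # assumed set of metadata kinds


def route(url_path: str) -> Tuple[str, Optional[str], str]:
    """Return (space, metadata kind or None, file path)."""
    if url_path.startswith("/data/"):
        return "data", None, url_path[len("/data"):]
    if url_path.startswith("/meta/"):
        kind, slash, rest = url_path[len("/meta/"):].partition("/")
        if kind in META_KINDS and slash:
            return "meta", kind, "/" + rest
    raise ValueError(f"unroutable path: {url_path!r}")
```

Since the prefix is consumed before the file path starts, a file that happens to be called `meta` or `acl` cannot collide with it.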
0

lorentz

15282

3y

@netikras In the case of the header option, do you think a magic mimetype, magic range, or a custom header is preferable?

1st opt:
Accept: application/json+meta
Content-Type: application/json+meta
2nd opt:
Range: metadata
3rd opt:
X-Meta-Level: metadata (default is data)

Now thinking about it, 1 and possibly 2 could also be prone to injection attacks since middlemen may allow clients to influence the value of these headers.
1

netikras

34605

3y

@lorentz I don't think Range fits the bill, as per https://developer.mozilla.org/en-US... it should have a strict format.

Content-type - ... IDK, feels iffy to me.

If I had to, I'd go with a custom header.
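
For illustration, a sketch of how the custom-header option could be read on the server. It reuses the X-Meta-Level name and the "default is data" rule from the comment above; the function name and the strict rejection of unknown values are assumptions.

```python
from typing import Mapping

ALLOWED_LEVELS = {"data", "metadata"}


def meta_level(headers: Mapping[str, str]) -> str:
    # Header names are case-insensitive, so normalise before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    level = normalized.get("x-meta-level", "data").strip().lower()
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported meta level: {level!r}")
    return level
```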
0

IntrusionCM

13947

2y

@lorentz I'm wondering what you mean by injection attacks....

I have a vague idea what you could mean, but I think you're overcomplicating stuff without a clear reason.

Man in the middle would mean that your API is HTTP-only without any form of authentication, hence my question.

Regarding WebDAV: Don't.

WebDAV is kind of a universe of its own, as it attends to many different things. If you want lean and mean, then WebDAV is like the people you see in TV shows like "My life with 300 kilograms".

The other question I have is what you mean concretely by metadata.

Headers are not endlessly long, they *should* be in ASCII (though this differs based on webserver implementation)

Header length varies by server and client implementations, if my brain serves me right, it was 4k to 8k...

HTTP/2 explicitly requires header names to be *sent* (network side) in lower case; however, many applications (application side) still represent them with upper-case chars.

So two important things: ASCII and max-length of headers.

Headers are usually received and evaluated before the body, which is an often underestimated performance win. Fast evaluation lets the server end the connection early if people try funky stuff...

That is the reason why *security* related information like the Host header, content encoding etc. *must* be encoded in the headers.

So this might be another reason to encode metadata inside headers, despite the size limit.

X-Meta-Level.

I really advise against two things:
1) Using the X-prefix
2) "misusage" of existing headers

1) see e.g. https://rfc-editor.org/rfc/rfc6648/
Use a vendor prefix, e.g. lorentz, if it tickles your fetish. Otherwise just name it properly so it is distinct from known header names (see IANA).

2) don't. many webservers and client implementations exist. Unless you can prove that you break none of them and your design doesn't cause side effects (which is impossible)… don't.

Still would be interested what you mean by metadata?
1

lorentz

15282

2y

@netikras My bad, I was going off of https://httpwg.org/specs/... but I misread the rule for extensible ranges. Ranges need a distinguishing name, so a compliant format could be

Range: meta-level=metadata

I want to use alternative range specifiers more conservatively elsewhere anyway, because I've been learning about z/OS and I've come to the conclusion that supporting row-based ranges at the API level massively improves the expressiveness of the API calls, ultimately resulting in more efficient operation.

Range: rows=0-100

The benefits of this are immediately apparent in very large directory listings, which most remote file access protocols tend to struggle with.
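
A quick sketch of how a `rows` range could be parsed and applied to a directory listing. The helper names are hypothetical, only the single `first-last` form is handled, and treating both ends as inclusive is an assumption that mirrors how byte ranges behave.

```python
import re
from typing import List, Tuple

_RANGE_RE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9-]*)\s*=\s*(\d+)-(\d+)\s*$")


def parse_range(value: str) -> Tuple[str, int, int]:
    """Parse 'rows=0-100' into ('rows', 0, 100)."""
    match = _RANGE_RE.match(value)
    if match is None:
        raise ValueError(f"malformed range: {value!r}")
    unit = match.group(1).lower()
    first, last = int(match.group(2)), int(match.group(3))
    if first > last:
        raise ValueError(f"unsatisfiable range: {value!r}")
    return unit, first, last


def slice_rows(rows: List[str], first: int, last: int) -> List[str]:
    # Both ends inclusive, like byte ranges.
    return rows[first:last + 1]
```

A client paging through a huge directory would then send `rows=0-100`, `rows=101-201` and so on, instead of fetching the whole listing.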
0

lorentz

15282

2y

@IntrusionCM

On the security risks:
Consider a third party service which has extensive access to my API and exposes some functionality to its users. By using path postfixes I introduce a risk into their systems that they need to consider and mitigate. This is generally the case when a parameter has a class of low risk values and a high risk value which also matches the most convenient validation rules for the low risk class (the intended input range).

On the header:
I forgot that the X prefix is no longer recommended, thanks for the heads up.

On the metadata:
ACLs and locked state are the only two metadata fields currently, but I'm not opposed to adding more as use cases develop with the constraint that metadata managed by the file store must be relevant to the file store and not just the contents of the files. Eg. the author's identity is NOT relevant because it can go in the content where this is meaningful at all, and for security the access logs show more than headers could.
0

Oktokolo

11330

2y

@lorentz Injection is always a concern - but not hiding stuff in headers or other obscure places doesn't make injection harder or easier to do or detect.
0

lorentz

15282

2y

@Oktokolo No. In this case the problem is specifically with encoding elevated access in the postfix of a string that's otherwise safe to forward, because this makes the obvious solution unsafe and the safe solution tedious. I can't explain how injection pertains to API design any better than I did above.
0

lorentz

15282

2y

By now most people have learned not to blindly concat SQL or HTML, but literally every single codebase I've ever worked on concats HTTP paths.
1

IntrusionCM

13947

2y

@lorentz I fail to understand your argumentation...

If you are worried about illegal chars inside a URL path - that's the server's job.

A (good) server will filter already, many other servers have an explicit strict mode.

Even if a client managed to somehow bypass the server-side validation, I'd still assume that you validate an incoming query.

In HTTPS there is no man in the middle.

With authentication you could be sure that the API is only used by the authenticated users.

A ':' is a valid character in a path.

Where is this idea of injection coming from?

Unless you explicitly allow it in your API / server, I don't know how.

If the client builds an invalid URL, you should be able to deal with it inside your API and return eg. Not found / Bad request status?

If you mean sth like HTTP smuggling attacks... Or header based attacks... Now that's another topic.
2

Oktokolo

11330

2y

@lorentz There is no access right encoding in the path - it just points at meta data in addition to the file. First parse the request so you know what is to be accessed, then do safety and security checks, then perform the action.
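
Roughly, the order of steps described above could look like this. The grants table, the function name, and the returned status codes are made up for the example; a real store would look permissions up in its ACLs.

```python
from typing import Dict, Set, Tuple

# Hypothetical permission table: (user, file path) -> readable parts.
Grants = Dict[Tuple[str, str], Set[str]]


def handle_read(path: str, user: str, grants: Grants) -> int:
    # 1. Parse the request so it is known what is being accessed.
    file_path, _, selector = path.partition(":")
    kind = selector.split(":")[0] if selector else "content"

    # 2. Run the security check against the parsed target.
    if kind not in grants.get((user, file_path), set()):
        return 403

    # 3. Only now perform the action (omitted in this sketch).
    return 200
```

With this order, a user who may only read content is refused when the same path carries an `:acl` selector.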
