Paths are an extension of anchors that was introduced in RSM. They fill the same role but add a lot more flexibility and dimensionality, and allow you to create complex indexes to query the DHT quickly and easily.
You can think of paths as anchor trees: instead of creating a single anchor entry to hold all the links to a particular type of entry, we create several, to distribute those links much more evenly across the DHT. If you haven't done the anchors exercise, do it now before doing the paths one.
The content of each path is a string with segments separated by a dot, for example `all_tasks.project1.finished`. This path will create these entries:
- `all_tasks`
- `all_tasks.project1`
- `all_tasks.project1.finished`
Here, you can see that the root parent of the path is `all_tasks`, which has `all_tasks.project1` as a child. Each of these entries has a hash in the DHT like any other entry. Also, every parent has a link pointing to each of its children.
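To make the parent/child structure concrete, here is a minimal sketch in plain Rust that derives the entries and parent-to-child links implied by a dot-separated path string. It does not use the HDK; the function names `path_entries` and `parent_child_links` are hypothetical and exist only for illustration.

```rust
/// Returns every path implied by a dot-separated path string,
/// ordered from the root down to the full path.
/// (Hypothetical helper, not part of the HDK.)
fn path_entries(path: &str) -> Vec<String> {
    let mut entries = Vec::new();
    let mut current = String::new();
    // Skip empty segments so stray dots do not produce empty entries.
    for segment in path.split('.').filter(|s| !s.is_empty()) {
        if !current.is_empty() {
            current.push('.');
        }
        current.push_str(segment);
        entries.push(current.clone());
    }
    entries
}

/// Returns the (parent, child) pairs that would be linked together.
/// (Hypothetical helper, not part of the HDK.)
fn parent_child_links(path: &str) -> Vec<(String, String)> {
    let entries = path_entries(path);
    entries
        .windows(2) // each consecutive pair is a parent and its child
        .map(|pair| (pair[0].clone(), pair[1].clone()))
        .collect()
}
```

Calling `path_entries("all_tasks.project1.finished")` yields the three paths from the root to the leaf, and `parent_child_links` yields the two parent-to-child pairs between them.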
There are two goals we have in mind when using paths:
- Reducing DHT hotspots
If we only create one anchor entry and attach all the links to posts from that entry, the poor nodes that will be holding that entry will end up holding all those links as well - this can get big in terms of storage. Creating multiple entries makes it so that the links get distributed around in the DHT much more evenly.
- Read performance
Usually we don't want to query "all the posts that have ever been created". Imagine that you want to get the posts from the last day. If we only have one anchor entry, this can get really slow, because we need to do a `get` for every post to check whether it was made in the last day, and then return only the ones that were. Instead, if we are a bit smart about how we create the paths, we can query just the appropriate anchors, which only hold the posts for that day.
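One way to be "a bit smart" here is to encode the date into the path segments, so that each day (and each hour within it) gets its own anchor. The sketch below is plain Rust and only builds the path strings; the `all_posts` root, the `year-month-day` segment format, and the function names are assumptions made for illustration, not a layout prescribed by this exercise.

```rust
/// Builds a day-level path string such as "all_posts.2021-2-21".
/// (The root name and segment format are assumptions.)
fn day_path(year: u32, month: u32, day: u32) -> String {
    format!("all_posts.{}-{}-{}", year, month, day)
}

/// Builds an hour-level path string that is a child of the day path,
/// such as "all_posts.2021-2-21.21".
fn hour_path(year: u32, month: u32, day: u32, hour: u32) -> String {
    format!("{}.{}", day_path(year, month, day), hour)
}
```

With a layout like this, links to posts would be attached to the hour-level path. Posts for a single hour then come from one path, and posts for a whole day come from the children of the day-level path.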
Here you can create paths yourself, and see which entries and links are created.
These entries are useful mainly as bases to attach links to. If you attach links from the `all_tasks.project1.finished` path to every task in `project1` that has finished, you can then do a `get_links` on that path to get only those tasks.
If, on the other hand, you want to get all tasks within the project regardless of status, you can get all the children paths of `all_tasks.project1`, which will give you, for example, `all_tasks.project1.finished`, and then do a `get_links` on each of them to get the tasks.
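The two query styles described above can be simulated with an in-memory index. This is a sketch in plain Rust using ordered maps as a stand-in for the DHT; `PathIndex` and its methods are hypothetical and only mimic the behaviour described here, they are not the HDK API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-in for the DHT (hypothetical, for illustration only).
#[derive(Default)]
struct PathIndex {
    /// parent path -> direct child paths
    children: BTreeMap<String, BTreeSet<String>>,
    /// path -> items linked from that path
    links: BTreeMap<String, Vec<String>>,
}

impl PathIndex {
    /// Registers a path and its ancestors, linking each parent to its child.
    fn ensure(&mut self, path: &str) {
        let mut parent: Option<String> = None;
        let mut current = String::new();
        for segment in path.split('.') {
            if !current.is_empty() {
                current.push('.');
            }
            current.push_str(segment);
            if let Some(p) = parent.take() {
                self.children.entry(p).or_default().insert(current.clone());
            }
            parent = Some(current.clone());
        }
    }

    /// Attaches an item (for example a task) to a path.
    fn link(&mut self, path: &str, item: &str) {
        self.ensure(path);
        self.links
            .entry(path.to_string())
            .or_default()
            .push(item.to_string());
    }

    /// Items linked directly from this path (simulates a single `get_links`).
    fn get_links(&self, path: &str) -> Vec<String> {
        self.links.get(path).cloned().unwrap_or_default()
    }

    /// Items linked from every direct child of this path.
    fn get_links_from_children(&self, path: &str) -> Vec<String> {
        self.children
            .get(path)
            .into_iter()
            .flatten()
            .flat_map(|child| self.get_links(child))
            .collect()
    }
}
```

After linking a few tasks under `all_tasks.project1.finished` and a hypothetical sibling such as `all_tasks.project1.pending`, `get_links` on the `finished` path returns only the finished tasks, while `get_links_from_children` on `all_tasks.project1` returns the tasks under every status.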
You can imagine different types of indexes built on top of paths, with multidimensional properties.
To use paths in your zome, include the `Path` entry definition in your entry definitions:
entry_defs![ Path::entry_def(), ... ];
We need to code a small zome that satisfies these capabilities:
- Create a new post, passing a content and some tags
- Get all posts within a day or an hour. Examples:
  - "get me all posts posted on 21st February, 2021"
  - "get me all posts posted between 21:00 and 22:00 of 21st February, 2021"
- Get all the tags that have been created
- Get all posts that have been created with a certain tag
  - "get me all posts that have been posted with the tag "nature""
You can follow this entry design to accomplish it:
- Go to the
- Enter the nix-shell:
  You should run this in the folder containing the default.nix file.
  Starting the nix-shell for the very first time can take a long time, somewhere between 20 and 80 minutes. After that it takes just a few seconds.
- Go to the folder with the exercise
- Implement all
- Implement all
- Compile and test your code: `cd tests && npm install && npm test`
- Don't stop until the tests run green