Dropbox is a cloud-based file synchronization solution. Dropbox allows users to create a special folder on their computers, which Dropbox then synchronizes so that it appears to be the same folder (with the same contents) regardless of which device is used to view it.
File-based solutions provide humans with a way to store and collaborate around files organized through a hierarchy of folders.
File synchronization solutions clone all the files from a master copy onto one's local device, hence consuming some of the device's local storage capacity.
Decentralized systems distribute both the data and metadata so as to remove any potential single point of failure or bottleneck.
A controlled environment is a set of servers and clients that are under the control of a single entity e.g. a company.
Redundancy mechanisms ensure the availability and durability of the files stored in the storage infrastructure.
The ability to keep a storage system operational depends on algorithms that must be adapted to the nature of the environment.
Every communication is encrypted (in transit) but data, once stored, is not (at rest).
Some systems do not allow for the overall storage capacity to evolve over time.
The ways a user or program can interact with a storage system, either through specific desktop/mobile applications or standard protocols.
Object storage architectures manage data as objects. Object storage is mostly used to store application-specific data: movies in the case of Netflix, for example.
File storage solutions represent data in files organized in folders and subfolders. This organization of information into a hierarchical view is perfectly adapted for humans to store and collaborate. In addition, file systems often provide an access control mechanism, in particular in multi-user environments, to control which files/directories can be seen/edited by other users/groups.
Another architecture exists, known as block storage, which manages data as blocks within sectors and tracks i.e. very much like a raw hard disk drive.
Some systems do not allow for the overall storage capacity to evolve over time. The ones that do are said to be scalable. Scalable storage systems can be further categorized into those that can scale without any interruption of service and those that require shutting down the system. In reality, the vast majority of existing scalable systems can scale dynamically i.e. without shutting down the service.
There exist two types of systems when it comes to scalability. The first category contains systems that can scale over homogeneous resources such as local disks, network-attached storage resources etc.; resources that are under the control of the infrastructure's administrator.
Systems capable of heterogeneous scalability are able to integrate additional storage capacity from resources and providers of a different nature: in the cloud, through an API and so on. Heterogeneous-scalable storage systems often rely on at-rest encryption and Byzantine consensus algorithms to cope with potentially malicious behaviors, since the integrated storage resources are often not under the control of the infrastructure's administrator.
Note that should such a system provide these mechanisms, it would exhibit most of the prerequisites to be deployed in a worldwide environment.
Many storage systems provide no redundancy mechanism and therefore cannot ensure the reliability of the data they store. As a result, should a server fail in such a system, some (and possibly all) of the data would be made unavailable ― temporarily or permanently depending on the nature of the failure.
Reliability ― represented by both durability (data stored remain in the system) and availability (data is always accessible) ― implies that a system provides some sort of a redundancy mechanism. There are basically two categories of redundancy algorithms: replication and erasure codes (which are error-correcting codes).
Replication consists in creating exact copies of a data item to ensure that, if a copy goes missing (following a server failure, corruption or the like), the system can keep operating with the remaining copies.
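As a rough illustration, here is a minimal sketch of 3-way replication in Python; the in-memory "servers", replication factor and placement rule are hypothetical and purely illustrative:

    # Minimal sketch of 3-way replication over hypothetical in-memory "servers".
    REPLICATION_FACTOR = 3
    servers = [dict() for _ in range(5)]   # five key/value stores standing in for storage servers

    def put(key, data):
        # Write the same item to REPLICATION_FACTOR distinct servers.
        start = hash(key) % len(servers)
        for i in range(REPLICATION_FACTOR):
            servers[(start + i) % len(servers)][key] = data

    def get(key):
        # Any surviving copy is enough to serve the read.
        for server in servers:
            if key in server:
                return server[key]
        raise KeyError(key)

    put("/home/user/report.txt", b"some bytes")
    assert get("/home/user/report.txt") == b"some bytes"

With three copies, a read still succeeds even after two of the chosen servers have failed.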
There are many ways to decompose the original file, distribute it and replicate it between the storage media (server, disk etc.) in order to benefit from different properties. The RAID technology for instance provides several schemas, or RAID levels, to achieve different trade-offs between reliability, availability, performance and capacity.
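RAID 5, for instance, stores a parity block computed by XOR-ing the data blocks, so that any single lost block can be rebuilt from the remaining ones. A minimal sketch of that idea (block contents and disk count are illustrative):

    # RAID 5-style parity: parity = XOR of the data blocks; any one lost block
    # can be rebuilt by XOR-ing the remaining blocks with the parity.
    from functools import reduce

    def xor_blocks(blocks):
        return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

    data_blocks = [b"AAAA", b"BBBB", b"CCCC"]     # blocks stored on three data disks
    parity = xor_blocks(data_blocks)              # parity block stored on a fourth disk

    # Simulate losing the second disk and rebuilding its block.
    surviving = [data_blocks[0], data_blocks[2], parity]
    rebuilt = xor_blocks(surviving)
    assert rebuilt == data_blocks[1]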
Erasure codes, such as Reed-Solomon, do not create raw copies of a data item as replication does. Instead, error-correcting codes transform a data block of k symbols into a longer block of n symbols such that the original data block can be recovered from a subset of the n symbols.
Erasure codes are more interesting than replication when it comes to storage consumption since less storage capacity is required to achieve the same durability and availability. However, the process of writing is slower than with replication as more servers need to be contacted to host data symbols. Likewise, more computing power is required to reconstruct the original data from several pieces when accessing data, leading to more latency.
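The difference in storage consumption is easy to quantify. Below is a quick back-of-the-envelope comparison between 3-way replication and a hypothetical Reed-Solomon configuration with k = 10 data symbols and n = 14 total symbols; the parameters are illustrative, not taken from any particular product:

    # Storage overhead = raw capacity consumed / useful data stored.
    file_size_gb = 100

    # 3-way replication: three full copies of the data.
    replication_copies = 3
    replication_raw_gb = file_size_gb * replication_copies   # 300 GB

    # Reed-Solomon (k = 10, n = 14): 10 data symbols plus 4 parity symbols;
    # any 10 of the 14 symbols are enough to rebuild the original data.
    k, n = 10, 14
    erasure_raw_gb = file_size_gb * n / k                     # 140 GB

    print(replication_raw_gb / file_size_gb)   # 3.0x overhead
    print(erasure_raw_gb / file_size_gb)       # 1.4x overhead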
Erasure codes are better suited to archiving while replication is often preferred for primary storage. Note however that, beyond 100 TB of data, the gain in storage capacity achievable through erasure coding becomes interesting enough to be considered even for primary storage.
Note that such redundancy mechanisms differ from backups. Should a server fail, redundancy mechanisms ensure that clients can continue accessing the data transparently (a property known as availability). Backups however require restoring a snapshot before the data becomes accessible again, a process which can take several days during which the whole system is non-operational.
Most storage systems do not encrypt data, either in transit (between clients and servers) or at rest (once stored on a server), because the environment allows it, i.e. the clients and servers are located in a trusted network such as a company's.
Large-scale worldwide environments require a storage solution that encrypts data both in transit and at rest so that a block of data cannot be decrypted by a potentially malicious storage server. Such at-rest encryption mechanisms are necessary because worldwide environments are usually considered untrustworthy, being composed of many devices under the control of unknown entities.
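As an illustration of at-rest encryption performed on the client side, here is a minimal sketch using the symmetric Fernet scheme from the Python cryptography package; the library choice and key handling are assumptions made for the example, not a description of how any given solution works:

    # Encrypt a block on the client before it ever reaches a storage server,
    # so the server only ever sees ciphertext.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()        # kept by the client, never sent to the server
    cipher = Fernet(key)

    block = b"contents of one data block"
    ciphertext = cipher.encrypt(block)     # what the (untrusted) server stores

    # Only a client holding the key can recover the original block.
    assert cipher.decrypt(ciphertext) == block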
In some cases, for instance cloud-based storage services, only the communications (in transit) are encrypted while the blocks are stored in plain form on the servers. This is required for those solutions to offer users a Web-based interface to browse and manipulate their files online.
File-based storage systems need to take into account the multitude of platforms in use today, both mobile and desktop, for human end-users to be able to manipulate their files through well-designed applications.
Object-based and block-based systems tend to comply with standard protocols to interact with the system, from SMB to specific HTTP-based APIs for cloud-based solutions. Likewise, many file-based storage systems integrate with existing protocols, e.g. NFS, to ease the process of deploying clients on thousands of machines. Noteworthy is that many operating systems (Windows, Mac OS X etc.) and mobile applications allow for connecting through such protocols to access remote file systems.
A storage solution, in particular a file-based one, can be accessed in different ways. Each interface has its benefits and disadvantages, which are summarized next.
Synchronization technologies clone all the files from a master copy (in the cloud for instance) to one's local device. This process has the advantage of keeping all the files local at any time, ensuring access even in worst-case scenarios e.g. when completely disconnected from the Internet.
Unfortunately, such a technology also has its drawbacks, chiefly that it consumes the local storage capacity of one's device. This becomes particularly problematic as the amount of data stored increases and exceeds the storage capacity available locally.
As a result, file synchronization solutions tend to provide a mechanism ― named Selective Sync for Dropbox ― for the user to select which files and folders should not be cloned locally, rendering them inaccessible from that device.
File synchronization is performed by a small program running in the background (i.e. a daemon) that monitors changes between the master and local copies. A POSIX-compliant file system, by contrast, is a low-level layer that emulates a hard disk drive. As such, file systems can capture low-level requests from applications and users such as reading and writing data from/to a file, adding an entry to a directory and so forth.
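To make the synchronization approach concrete, here is a minimal sketch of a polling daemon that detects local changes by hashing files; a real synchronization client relies on native file-system notifications and a far more elaborate protocol, so this only conveys the general idea:

    # Toy change detector: walk the synchronized folder, hash every file,
    # and report anything that differs from the previous pass.
    import hashlib, os, time

    def snapshot(root):
        digests = {}
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    digests[path] = hashlib.sha256(f.read()).hexdigest()
        return digests

    def watch(root, interval=5):
        previous = snapshot(root)
        while True:
            time.sleep(interval)
            current = snapshot(root)
            for path in current.keys() - previous.keys():
                print("created:", path)      # would be uploaded to the master copy
            for path in previous.keys() - current.keys():
                print("deleted:", path)
            for path in current.keys() & previous.keys():
                if current[path] != previous[path]:
                    print("modified:", path)
            previous = current

    # watch("/path/to/synced/folder") would run the daemon (path is hypothetical).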
File systems are better suited when users need to access large amounts of data without having to download everything locally, since everything can be retrieved on-demand i.e. streamed. The limitation of file systems lies in the need for additional mechanisms, e.g. caching, to allow users to access information when offline.
Web interfaces are appreciated because users do not need to install an application on their computer. However, the user may lose in productivity because some file formats may not be supported for online editing e.g. Photoshop, Catia etc.
Desktop applications bring the worst of both worlds as users need to learn a new way to manipulate their files, as with a Web interface, while requiring an application to be installed locally; not to mention that very few file formats are usually supported.
Most enterprise solutions provide a file system because such an interface is well suited to environments with a lot of data (no local storage consumption) while not degrading productivity by forcing users to learn to manipulate their files through a new and specific application or constraining them regarding the supported file formats.
The capacity to maintain a storage system operational in the event of failure is referred to as fault tolerance.
While some systems provide no fault tolerance mechanism, most do. Depending on the nature of the resources composing a storage infrastructure, a system will need different algorithms to detect potential failures: bug, crash (temporary or permanent) or even malicious behavior.
The distributed storage systems that do offer some means of redundancy often integrate a mechanism that monitors storage servers to detect potential failures. Note however that such standard fault tolerance algorithms assume that the servers will always follow the system's protocol, because distributed storage solutions are generally deployed within a controlled environment e.g. a company's network.
If some of the storage resources composing the infrastructure are not under the control of the administrator, this assumption no longer holds and the fault tolerance algorithm must be adapted to tolerate failures known as Byzantine i.e. potentially malicious behaviors.
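The difference shows up directly in the number of replicas required: crash-tolerant protocols typically need 2f + 1 servers to tolerate f failures, whereas Byzantine-tolerant protocols (PBFT-style) typically need 3f + 1. A small sketch of that arithmetic:

    # Minimum number of replicas needed to tolerate f faulty servers,
    # under the usual assumptions of crash-tolerant vs. Byzantine-tolerant protocols.
    def replicas_needed(f, byzantine=False):
        return 3 * f + 1 if byzantine else 2 * f + 1

    for f in (1, 2, 3):
        print(f, replicas_needed(f), replicas_needed(f, byzantine=True))
        # f=1 -> 3 crash-tolerant replicas, 4 Byzantine-tolerant replicas
        # f=2 -> 5 and 7, f=3 -> 7 and 10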
Finally, some systems ― in particular distributed file storage without a redundancy mechanism ― may be partially fault tolerant, allowing users to access the files that remain available while returning an error for any file located on a server that failed.
The environment in which a storage system is deployed radically changes the constraints it must take into account when designing security, reliability (redundancy & fault tolerance) and scalability mechanisms.
The vast majority of storage solutions have been designed to operate in a controlled environment i.e. a set of servers and clients that are under the control of a single entity e.g. a company.
By evolving in a controlled environment, such solutions can better anticipate the infrastructure's variations (latency, uptime etc.) in order to optimize the protocols and algorithms. As an example, the latency in a local area network will not be a constraint, while the data will not need to be encrypted at rest since it is hosted on trusted servers.
Storage solutions that evolve in a globally distributed environment, also known as worldwide, face unique challenges regarding latency, trust etc.
Indeed, in such environments, the data is spread across a large number of computing devices that are under the control of different entities. As such, the blocks composing a user's file may end up being stored on computers located in different countries e.g. the US, China, Germany etc.
A user may not feel comfortable knowing that his/her files are stored on someone else's computer, even though the blocks are encrypted. This is particularly worrying for businesses that could be interested in the technology but need to control where and how their files are actually stored.
Note that some solutions provide tools allowing developers to create their own storage infrastructure: Tahoe-LAFS, IPFS and Infinit. As such, a developer can decide which computers are involved in the infrastructure while defining scalability policies, whether or not untrustworthy nodes are allowed to join the network, etc. As a result, such solutions can be deployed in both controlled and worldwide environments.
The way the different servers and clients are arranged in a storage system can have a drastic impact on the overall infrastructure's performance, scalability and reliability (redundancy & fault tolerance).
Centralized storage systems store both the data and metadata on a single server. Even though such models are interesting for their simplicity, the availability and durability of the whole infrastructure depends upon a single server.
Adding more servers would add complexity to the overall infrastructure but would allow it to support a larger number of clients. Such storage systems distribute the requests and data between the different storage servers so as to increase scalability and resilience. Distributed storage systems however have the particularity of storing the metadata on specific servers, sometimes referred to as metadata servers or master servers (as opposed to data or slave servers).
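Schematically, a client in such a system first asks a metadata server where the blocks of a file live, then fetches them from the data servers. The sketch below uses hypothetical names and in-memory dictionaries purely to illustrate the split:

    # Hypothetical sketch of the metadata-server / data-server split.
    metadata_server = {
        # path -> list of (data_server, block_id)
        "/reports/q3.pdf": [("data-1", "blk-17"), ("data-2", "blk-42")],
    }
    data_servers = {
        "data-1": {"blk-17": b"first half of the file"},
        "data-2": {"blk-42": b"second half of the file"},
    }

    def read(path):
        # 1. Ask the metadata server for the block locations.
        locations = metadata_server[path]
        # 2. Fetch each block from the data server that holds it.
        return b"".join(data_servers[server][block] for server, block in locations)

    print(read("/reports/q3.pdf"))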
For some large-scale infrastructures, distributed systems are limited by the capacity of the metadata servers. Indeed, should there be too many clients' requests, a metadata server could fail. Since a load balancing mechanism is often provided, following such a failure the clients will start re-routing their requests to the other metadata servers, implicitly increasing the load on those, possibly leading to a cascading failure.
Decentralized systems are designed for extreme scalability, taking into account that nodes will fail eventually.
As a result, such systems distribute both the data and metadata across the storage servers so that no server performs a specific task. There is no metadata server that could act as a single point of failure or bottleneck.
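A common way to achieve this, typical of DHT-based designs (an assumption about common practice, not a claim about any specific product), is consistent hashing: every client can compute, from a key alone, which servers are responsible for it, so no central metadata server is needed. A minimal sketch:

    # Consistent-hashing sketch: place each key on the first nodes whose position
    # on a hash ring follows the key's position. Any client can compute this locally.
    import bisect, hashlib

    def ring_position(name):
        return int(hashlib.sha256(name.encode()).hexdigest(), 16)

    nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    ring = sorted((ring_position(n), n) for n in nodes)

    def responsible_nodes(key, replicas=3):
        start = bisect.bisect(ring, (ring_position(key), ""))
        return [ring[(start + i) % len(ring)][1] for i in range(replicas)]

    print(responsible_nodes("/photos/cat.jpg"))   # same answer on every client, no lookup server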