
AFSoverview


Summary

OpenAFS (OAFS) is deeply woven, on a very large scale, into NJIT's academics, research, web, database, and other services, as well as its systems administration.

Replacing the cad.njit.edu OAFS cell with other, non-AFS methods of providing equivalent functionality is probably not feasible, technically or economically, using a variety of methods, and is not feasible using any single method. Such a replacement would incur many downsides, including :

  • A large increase in systems administration effort, due to :
    • Markedly decreased efficiency of managing a non-AFS environment
    • Attempting to reproduce current functionality in a new environment
  • Loss of single filesystem and name space across all platforms - Linux, MacOS, Windows
  • The need to educate users regarding the new environment; large-scale documentation changes
  • Loss of the possibility of collaboration via geographically dispersed AFS cells. This has implications for state-wide collaborations - e.g., research over high-speed networks on data sets too large to move

I. Reasons to use OpenAFS (OAFS)

OAFS has a large number of important advantages compared to other filesystems :

  1. More than 20-year history of working very well and very reliably at NJIT - deeply woven on a very large scale into academics, research, web, database, and other services, and systems administration.
  2. Clients for all platforms - Linux, Mac OSX, Windows, others.
  3. Extremely efficient administration and applications distribution
  4. Single global name space for all clients.
  5. Long history of working well and reliably at many institutions and international corporations.
  6. No administrator intervention in making mount points available to all clients, other than creating the mount point - a single command done on any client.
  7. Read-only replication of volumes.
  8. Scalability - number of clients, volumes, users, readily accommodated.
  9. Fine-grained ACLs.
  10. Machine-based ACLs (Heavily used in the cad.njit.edu cell)
  11. Native Kerberos integration.
  12. Simple enforcement of quotas.
  13. Reconfigurations with no user impact.
  14. On-line backup volumes.
  15. Client caching.
  16. Collaboration via cells across geographic regions.

What will be lost if there is a move off of AFS

  • Item 2. Clients for all platforms. Data is served by a single application, rather than by an untried mix of applications. This greatly simplifies the delivery of software, documentation, and user home space.
  • Item 3. Efficiency of administration and applications distribution. OAFS, and by extension AuriStorFS, is simple to administer. An alternative assortment of applications would be far more complex and difficult to administer.

    AFS provides a unique environment in which to distribute software. The ability to install software in its own volume (container), and then manipulate that volume transparently to the user, allows for risk-free software updates and module installations. This capability gives systems administrators a tool that greatly increases their efficiency.

  • Item 4. Single global name space. Different platforms will see different name spaces - i.e., balkanization. The ability to communicate the locations of directories and files common to all platforms would be lost.
  • Item 6. Mount points easily made available. AFS uses server-side (rather than client-side) mounting of filesystems. Mounting is done automatically, with no administrator or root intervention on clients; a client knows on which fileserver to find files without any administrator intervention. Loss of this capability would have enormous consequences for the efficiency of deploying software (both opensource and commercial) to clients; see the sketch following this list.
  • Item 7. Read-only replication of volumes. Provides file availability during a fileserver outage. Could be used to advantage at a disaster recovery site. Replication is built into AFS; additional software of some kind would be needed to add this capability for non-AFS products.
  • Item 8. Scalability. The scalability of replacement products is unknown, and would be limited to the capabilities of the least scalable product of the replacement set.
  • Item 9. Fine-grained ACLs. Differences between the ACLs implemented in POSIX and in Windows mean that the common ACLs now in place across all platforms under OAFS would not be possible.
  • Item 11. Native Kerberos integration. Aligns with IDM, SSO plans.
  • Item 12. Simple enforcement of quotas. The administration of the cad.njit.edu cell relies on enforcement of quotas on many types of directories. Other filesystems may support user and group quotas, but unlike AFS, they do not support quotas at the "data-chunk" (volume) level.
  • Item 13. Reconfigurations with no user impact. Routine maintenance will mean filesystem unavailability, with resultant file unavailability. Furthermore, combined with the loss of the single global name space, it will be difficult to determine which files will not be accessible during a fileserver outage.
  • Item 14. On-line backup volumes. Provides users immediate access to the previous day's files, and provides administrators immediate access to all of the previous day's files. To achieve similar capability with other products would require deployment of additional software, and twice the disk space OAFS uses. Absent backup volumes, a large increase in the number of restore requests would be expected.
  • Item 15. Client caching. Significantly increases performance. Use of other products would greatly increase network traffic and fileserver disk access.
  • Item 16. Collaboration via cells across geographic regions. AFS was designed with this unique capability, which allows users in one cell to seamlessly access data in other cells. This possibility would be lost.
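
As a concrete illustration of items 6, 7, 9, 12, and 14 above, the following minimal sketch drives the standard OpenAFS administration commands (vos and fs) from Python. The server, partition, volume, and path names are hypothetical, and the operations require AFS administrator tokens; the commands themselves (vos create, fs mkmount, fs setacl, fs setquota, vos backup, vos addsite, vos release) are the stock OpenAFS interfaces.

  import subprocess

  def afs(*cmd):
      # Run an OpenAFS command, raising an error if it fails.
      subprocess.run(cmd, check=True)

  # Item 6 : create a volume and mount it once; the mount point is
  # immediately visible to every client in the cell.
  afs("vos", "create", "fs1.cad.njit.edu", "/vicepa", "sw.example")
  afs("fs", "mkmount", "/afs/cad.njit.edu/sw/example", "sw.example")

  # Item 9 : fine-grained ACL - grant user jsmith read and lookup rights.
  afs("fs", "setacl", "/afs/cad.njit.edu/sw/example", "jsmith", "rl")

  # Item 12 : enforce a quota on the volume, in kilobytes.
  afs("fs", "setquota", "/afs/cad.njit.edu/sw/example", "-max", "5000000")

  # Item 14 : create the on-line .backup volume (previous day's snapshot).
  afs("vos", "backup", "sw.example")

  # Item 7 : read-only replication - add a replica site, then release.
  afs("vos", "addsite", "fs2.cad.njit.edu", "/vicepa", "sw.example")
  afs("vos", "release", "sw.example")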

I.1 Uses of AFS at NJIT : User Services

  • Home directories
    • General-purpose academic data : Web pages, code, executables, applications output, documents, etc.
  • Course directories : hundreds per semester
  • Research directories : thousands, across about 20 departments
  • Research websites
  • Portions of DMS and CCS departmental websites
  • Club websites for clubs not using Google Sites
  • Departmental administrative documents
  • MySQL databases

I.2 Uses of AFS at NJIT : Systems Administration

Enterprise-wide deployment of :

  • Opensource and commercial software
  • Opensource libraries
  • Scripts
  • System-wide utilities
  • System-wide configuration and reference files
  • Administration program output files

II. Largest current deployments

II.1 Largest current AFS worldwide commercial deployments

These mission-critical deployments, which are not public (almost always the case with commercial cells), comprise hundreds of cells in billion-dollar corporations. This level of usage makes AFS an industry standard at the highest level.

  • GE Aircraft Systems
  • Goldman Sachs
  • IBM and all of its spinoffs
    • Lexmark
    • Lenovo's Thinkpad division
    • Hitachi's disk manufacturing
    • GlobalFoundries
  • KLM
  • Morgan Stanley : 180,000 Windows clients running OpenAFS 1.7.x
  • Qualcomm
  • United Airlines aircraft maintenance

II.2 Largest current AFS academic and research public cells

Cells that are not public cannot be enumerated, so this list includes only public cells. Like most academic cells, the NJIT cad.njit.edu cell is not public.

  • Arizona State Univ.
  • Carnegie Mellon Univ.
  • Deutsches Elektronen-Synchrotron (DESY)
  • MIT
  • North Carolina State University
  • Stanford Univ.
  • Univ of Michigan
  • Univ of North Carolina Chapel Hill - Arts and Sciences
  • Univ of North Carolina Charlotte
  • Univ of Notre Dame CRC (main campus is on autopilot)

ActivePublicCells

III. AuriStorFS as replacement for OAFS

III.1 AuriStorFS site status as of 01/19/2017

AuriStorFS is a commercial implementation of AFS with important enhancements relative to OAFS in performance, security, capacities, authorization, per-file ACLs, and administration. The company behind it was founded in October 2007 as Your File System (YFS). Jeffrey Altman, AuriStor's CEO, is very willing to discuss with NJIT his view of the relationship between AuriStorFS and other products.

  • Client : MIT Lincoln Labs; as of 3/2017, planning a large expansion, because applications that were never on OAFS are now being moved to AuriStorFS.
  • Client : Large Wall Street firm (30 cells). As of 3/2017, indicates they may double their license size.
  • Client : North Carolina State Univ. Designing the merger of three OpenAFS cells into one AuriStorFS cell.
  • Client : Vanderbilt Univ Advanced Computing Center has committed to purchase. Vanderbilt University, along with Univ of Tennessee Knoxville and UT Memphis, is submitting an NSF Campus Cyberinfrastructure proposal (NSF 16-567 Campus Cyberinfrastructure (CC*)) that uses AuriStorFS to provide name space and security services on top of a statewide Tennessee Open Research Cloud (TORC).
  • Client : Michigan State Univ., production conversion from OpenAFS to AuriStorFS by late November 2016 - up to 400,000 users
  • Client : SLAC (National Accelerator Laboratory - Stanford Univ).
  • Client : Lulea University of Technology, Sweden.
  • Client : Univ of Maryland College Park (as of 02/17/2017). 65,000 users.
  • In contracting stage with (as of 01/19/2017) :
    • Naval Research Labs
    • University of California, Santa Cruz, 55,000 accounts. Purchasing process has begun.
  • In trial at :
    • Large multi-national bank that has not previously used AFS
    • Univ of Notre Dame; will present AuriStorFS use results at Supercomputing 16 (SC16) at the UND booth and in a BOF
  • In preliminary discussions at :
    • FBI
    • Defense Information Systems Agency (DISA)

III.2 AuriStorFS security

  1. Security policies (authentication, integrity, privacy) can be required on volumes and file servers. Only a file server with a security policy equal to or stricter than the volume's policy can host that volume. These policies enforce the proper security posture for each connection a client uses when contacting a file server.
  2. Labels. Volumes and File Servers can be assigned arbitrary labels. A volume can only reside on a file server that has a superset of the labels assigned to the volume.
  3. The yfs-rxgk security class permits the use of the AES256-CTS-HMAC-SHA1-96 algorithm for encryption and provides perfect forward secrecy. As soon as the IETF finishes standardization, the AES256-CTS-HMAC-SHA384-192 algorithm will be supported.

In addition, AuriStorFS supports multi-factor access control entries, so it is possible to grant different permissions to :

  • anonymous
  • user
  • anonymous @ machine
  • machine
  • user @ machine

where "user" and "machine" are Kerberos identities.

Considerations in deploying AuriStorFS :

  • Licensing costs
  • Converting from OAFS to AuriStorFS is straightforward. However, the process of reverting from AuriStorFS to OAFS may be impractical
  • Viability of AuriStor, currently 4 FTEs and 6 contractors

III.3 AuriStorFS and cloud storage

AuriStor's roadmap includes two-way whole-file copy-and-sync between AFS directories and cloud storage directories (e.g., AWS/DropBox/GoogleDrive/OwnCloud). When the user makes changes in the AFS directory, those changes show up in the cloud directory, and vice versa. This capability has the added benefit that AFS backups would also back up the cloud directories.

In addition, it will be possible to do filtering :

  • Before a file is synced from AFS to the cloud, it could go through a content filter : e.g., is this information that is not allowed to be shared? (PII, etc)
  • Before a file is synced from the cloud to AFS it could go through a content filter : e.g., virus scanning
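
Because this capability is still on the AuriStor roadmap, no API for such filters exists yet. The following is a purely hypothetical Python sketch of what a pre-sync content filter could look like; the function name, the file path, and the naive SSN pattern are all illustrative assumptions, not AuriStor interfaces.

  import re

  # Hypothetical pre-sync hook : allow a file to sync from AFS to the
  # cloud only if it does not appear to contain PII (naive SSN check).
  SSN_PATTERN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")

  def safe_to_sync(path):
      # Return True if no SSN-like string is found in the file.
      with open(path, "rb") as f:
          return SSN_PATTERN.search(f.read()) is None

  print(safe_to_sync("/afs/cad.njit.edu/u/j/jsmith/report.txt"))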

The abstraction layer for AFS is the same regardless of the type of back-end vice partition (where data is stored) : clients access and manipulate files in exactly the same way, regardless of whether the files are local to the fileserver or, e.g., stored in an Amazon Web Services S3 back-end, or on any other type of vice partition.

Note that an institutional file system is required for the workflows of a robust, research-focused academic experience : various aspects of course work, web pages querying academic Oracle and MySQL databases, collaboration, data security (outsourced data storage), long-term backups, and the provision of core institutional services.

III.4 AuriStorFS and HPC

AuriStor, with its greatly enhanced performance compared to OAFS, would move NJIT HPC closer to the goals of the Tartan HPC Initiative by replacing the separate NFS-hosted /home directories currently deployed on Kong and Stheno with AFS-mounted directories, thus providing consistent storage for researchers using both clusters.

III.5 AuriStorFS and Maximum Number of Files in a Directory

As of September 2016, there have been three instances of researchers hitting the OAFS limit of about 64K files in a directory, which hampers their work. The AuriStorFS limit is about 20 million, about 310 times as many. This situation is expected to get worse quickly, as researchers generate an increasingly large number of files.

III.6 AuriStorFS and NetApp

  • Unlike the NetApp filesystem, and other such hardware-software combinations, OAFS / AuriStorFS is not dependent on any particular hardware; whereas the NetApp filesystem requires NetApp hardware, OAFS / AuriStorFS works on virtually any storage device. Thus, use of OAFS / AuriStorFS avoids hardware vendor lock-in.
  • With NetApp, the design of the structured name space must be done at the start of the implementation. That design is essentially locked in.
  • NetApp's NFS does not support rpc.rquotad, which means the standard "edquota" command cannot be used to adjust user quotas on Linux systems that mount the filesystem. Instead, administrators need to ssh to the NetApp device (a different system than the one mounting the filesystem) to modify quotas, or create a utility to do so. In contrast, an AFS administrator can adjust quotas from any system in the cell, regardless of what system (or device) the files are actually on. This is an example of how OAFS / AuriStorFS insulates administration from the particular underlying hardware.

III.7 AuriStor and the Linux kernel

It is likely that the AFS/AuriStor client (kAFS) will be integrated into the Fedora kernel in the near future, and eventually into RHEL. This is a very strong validation of AFS.

kAFS Hackathon

III.8 AuriStorFS and Standards

Definitions for "standard" and "native" are important. Does "standard" mean a protocol which is published by a standards body such as ISO or IETF? There are plenty of examples of standards such as NFSv4 which are designed to provide a false sense of interoperability and portability.

NFSv4 defines a minimal set of requirements and a very large set of optional behaviors. Finding the overlapping set between clients and servers (especially servers from different vendors) is difficult.

Or "standard" could mean "ubiquitous" such as CIFS/SMB. However, there are many variants of CIFS/SMB protocol with wide-ranging differences. Most non-Microsoft implementations are at least one major Windows OS revision behind and Microsoft has frequently broken interoperability by disabling support for older variants in more modern OS releases.

The Windows OAFS client used to be an SMB/AFS proxy server. The reason for building the kernel-mode OAFS redirector (aka the "native" client) is that Microsoft removed support for the SMB variant that the OAFS SMB/AFS proxy server implemented.

To a file system developer "native" means a file system that is implemented in the OS as a kernel mode driver that integrates with the OS vfs/ifs layer. By that definition "AFS" clients are native to Linux, MacOSX, and Windows.

One of the challenges of relying on OS vendor-shipped functionality is that it becomes imperative to upgrade the OS to get access to new features and capabilities. OS vendors rarely backport new functionality to prior OS releases.

Related to "standards" is the notion of enterprise-wide file systems at educational research institutions. In general, enterprise-wide file systems are not at such institutions. Instead, there is a high degree of balkanization, with different filesystems and storage methods deployed in different departments, and within departments.

III.9 AuriStor and OpenHPC

ARCS is in the process of moving provisioning and management of HPC clusters to the OpenHPC (OHPC) model.

OpenHPC

OpenHPC utilizes the SLURM scheduling and resource manager, which has a plugin (AUKS) to manage Kerberos credentials automatically. ARCS modified the plugin code to also generate AFS tokens, thus enabling a single file name space for all HPC components - including clusters - as well as for all other AFS clients.
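
The underlying mechanism is the standard one : once AUKS has placed a Kerberos ticket in a job's credential cache, the stock aklog command converts that ticket into an AFS token. Below is a minimal sketch of the equivalent step in Python; the cell name is an assumption, and aklog and tokens must be on the PATH.

  import subprocess

  def obtain_afs_token(cell="cad.njit.edu"):
      # aklog converts the job's existing Kerberos v5 ticket (already
      # provisioned by AUKS) into an AFS token for the given cell.
      subprocess.run(["aklog", "-c", cell], check=True)
      # tokens lists the AFS tokens now held, for verification.
      subprocess.run(["tokens"], check=True)

  obtain_afs_token()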

This single name space is possible only by using AFS for home directories. A user logged in to either Kong or Stheno (and any future clusters) will see exactly the same directories and files (and these directories and files will be identical to those seen on any other AFS client). This capability is not possible using NFS-mounted home directories, which are separate filesystems for each cluster.

III.10 AuriStor and cross-directory hard links

OpenAFS does not support cross-directory hard links, a capability that various software packages expect. AuriStorFS does support cross-directory hard links; see the sketch below.
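
The difference is easy to demonstrate. In the following sketch (paths hypothetical), os.link attempts a hard link across two directories in the same volume; on OpenAFS the call fails with EXDEV, while on AuriStorFS it succeeds.

  import os

  # Two directories in the same AFS volume (hypothetical paths).
  src = "/afs/cad.njit.edu/u/j/jsmith/dir1/data.txt"
  dst = "/afs/cad.njit.edu/u/j/jsmith/dir2/data.txt"

  try:
      os.link(src, dst)            # cross-directory hard link
      print("hard link created")   # expected on AuriStorFS
  except OSError as e:
      print("link failed :", e)    # EXDEV expected on OpenAFS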

III.11 AuriStor and O_DIRECT

AuriStorFS supports "O_DIRECT" mode; OpenAFS does not. O_DIRECT allows files being written on an AFS compute node to be visible on all other compute nodes, as well as on the login server. This capability is essential for HPC applications, since users need to be able to monitor, from the login server, output files being written by jobs running on compute nodes.
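
For reference, the sketch below opens a file in O_DIRECT mode from Python on Linux; the path is hypothetical. O_DIRECT requires block-aligned buffers, supplied here by a page-aligned anonymous mmap.

  import mmap
  import os

  path = "/afs/cad.njit.edu/research/job42/output.dat"  # hypothetical

  # O_DIRECT bypasses the client cache, so data written on one compute
  # node becomes visible to other nodes and to the login server.
  fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
  buf = mmap.mmap(-1, 4096)                  # page-aligned 4 KiB buffer
  buf.write(b"checkpoint data".ljust(4096, b"\0"))
  os.write(fd, buf)                          # one aligned block
  os.close(fd)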

III.12 AuriStor and HIPAA Compliance

III.13 Considerations of replacement of OAFS with a variety of applications and/or filesystems: Experience at other institutions

  1. Experience at JPL, from the 2015 OAFS and Kerberos Workshop. JPL is staying with OAFS for now :

    • AFS has one or more features or combinations of features that are not available with other available file systems
    • Organizations lose track of what AFS is doing and what AFS can do
    • Believing that AFS does "just X" (shares files) or is "just Y" (cross platform), niche solutions are implemented (SharePoint, other web based tools, drag and drop file transfer, home grown file system synchronization, NFS ...)
    • File systems are not "sexy" and not well understood by management; once implemented and integrated into the IT environment, AFS need rarely be discussed
    • Piecewise replacement of AFS is an inadequate approach; it results in a proliferation of costs due to supporting a variety of special-purpose solutions
    • Some loss of institutional control of data, for a variety of reasons -- historical protections (ACLs) not transferred, format differences, loss of Kerberos, different application controls, etc.
    • The cost, time, and security of moving off AFS are all unknown. The writer, K. Kimball, expects that they will never leave AFS, because no formal cost/risk analysis of use cases was performed
  2. Conference call with Jason Cowart, Stanford Central IT, on 1-Mar-2016
    • Would like to have already migrated to AuriStorFS, but for turnover of about 70% of Central IT services personnel
    • In a holding pattern right now, continuing to use OAFS, with a service contract with Sine Nomine
    • Expecting re-evaluation of the situation in a year

IV. Next steps

Based on the current state of research, including extensive discussions with AuriStorFS users :

  • Immediately purchase an AuriStorFS license for at least the cad.njit.edu cell. This will provide solid support in case of security, client, or server problems with OAFS. It will also provide needed capacity enhancements, notably the increase in the per-directory file limit (from about 64K files to about 20 million) described in section III.5, a limit that researchers had already hit three times as of September 2016, a situation expected to worsen quickly.

V. Costs

Moving from OAFS to AuriStorFS :

  • AuriStorFS licensing costs are specified under XXXXXX
  • This move would free up VM resources (28 fileservers -> 3 or 4 fileservers), and would result in an increase in performance
  • Estimated staff-hours : ??

Moving off of OAFS to an assortment of other technologies, not as yet identified :

  • As of 4/7/2016, the cad.njit.edu cell had 32,851 volumes, containing 144,553,379 files, using 25.6TB of disk, woven deeply into academics, research, web, database, system administration, and other services
  • It is not possible to estimate the level of effort that would be required to move off of OAFS, except that it would be many orders of magnitude greater than that required to move from OAFS to AuriStorFS

VI. Conclusions

  1. AFS is a critical and integral part of NJIT's academic and research infrastructure. It is deeply embedded, with over 20 years of reliable deployment. Trying to replace it with an assortment of other technologies would take an extraordinary amount of effort and money, and would result in a significantly inferior service.
  2. The point has been reached where licensing AuriStorFS is the rational and feasible course, in line with the purchase of support for other critical infrastructure applications.

Addenda

A. CST 19 May 2016 Document - Sites

CSTMay19DocSites

B. CST 19 May 2016 Document - Recommendation

CSTMay19DocRecommendation

Filesystem comparison

C. July 22 2016 Meeting

July22-2016Meeting.1

D. August 18 2016 Meeting

18Aug-2016Meeting

E. Merging of cad.njit.edu and uis.njit.edu cells

Merge

F. Indiana University Research Technologies

IUresearch

G. CERN

CERN