
Grid Computing P11



- Such lessons were not lost on the system designers of the early 1980s.
- In contrast to the dominant centralized control model of the day, Condor was unique in its insistence that every participant in the system remain free to contribute as much or as little as it cared to.
- The Condor system soon became a staple of the production-computing environment at the University of Wisconsin, partially because of its concern for protecting individual interests [15].
- Many previous publications about Condor have described in fine detail the features of the system.
- In this chapter, we will lay out a broad history of the Condor project and its design philosophy.
- 11.2 THE PHILOSOPHY OF FLEXIBILITY
- Each of these emphasizes a particular aspect of the discipline, but is united by fundamental concepts.
- 11.3 THE CONDOR PROJECT TODAY
- The actual development and deployment activities of the Condor project are a critical ingredient toward its success.
- As a result, a portion of the project resembles a software company.
- Two versions of the software, a stable version and a development version, are simultaneously developed in a multiplatform (Unix and Windows) environment.
- 11.3.1 The Condor software: Condor and Condor-G.
- When most people hear the word ‘Condor’, they do not think of the research group and all of its surrounding activities.
- Some of the enabling mechanisms of Condor include the following:.
- Figure 11.1 The available capacity of the UW-Madison Condor pool in May 2001.
- Notice that a significant fraction of the machines were available for batch use, even during the middle of the work day.
- The result is very beneficial for the end user, who is now enabled to utilize large collections of resources that span across multiple domains as if they all belonged to the personal domain of the user.
- Over the history of the Condor project, the fundamental structure of the system has remained constant while its power and functionality has steadily grown.
- At the agent, a shadow is responsible for providing all of the details necessary to execute a job.
- Later in this chapter, we will return to examine the other components of the kernel.
- Figure 11.3 The Condor Kernel.
- Each of the three parties – agents, resources, and matchmakers – are independent and individually responsible for enforcing their owner’s policies.
- Combining all of the machines into one Condor pool was not a possibility because each organization wished to retain existing community policies enforced by established matchmakers.
- If a gateway detects idle agents or resources in its home pool, it passes them to its peer, which advertises them in the remote pool, subject to the admission controls of the remote matchmaker.
- This is a map of the worldwide Condor flock in 1994.
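The gateway mechanism described above can be sketched in a few lines of Python. The class names and ad attributes here are illustrative stand-ins, not Condor's actual interfaces: a gateway watches its home pool for idle advertisements and forwards them to a peer pool, subject to that pool's admission policy.

```python
# Hypothetical sketch of gateway flocking between two Condor pools.
# Pool, Gateway, and the ad attributes are illustrative, not Condor's real API.

class Pool:
    def __init__(self, name):
        self.name = name
        self.ads = []               # advertisements known to this pool's matchmaker
        self.admits_remote = True   # admission-control policy for foreign ads

    def advertise(self, ad):
        self.ads.append(ad)

class Gateway:
    """Passes idle advertisements from its home pool to a peer pool."""
    def __init__(self, home, peer_pool):
        self.home = home
        self.peer_pool = peer_pool

    def flock(self):
        forwarded = []
        for ad in self.home.ads:
            if ad.get("State") == "Idle" and self.peer_pool.admits_remote:
                # Tag the ad so the remote pool knows where it came from.
                self.peer_pool.advertise({**ad, "Origin": self.home.name})
                forwarded.append(ad["Name"])
        return forwarded

wisconsin = Pool("UW-Madison")
delft = Pool("Delft")
wisconsin.advertise({"Name": "node1", "State": "Idle"})
wisconsin.advertise({"Name": "node2", "State": "Busy"})
gw = Gateway(wisconsin, delft)
print(gw.flock())  # only the idle machine crosses the gateway
```

Note that each pool keeps its own matchmaker: the gateway only re-advertises, so the remote matchmaker's policies still govern whether the borrowed resource is ever matched.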
- Another disadvantage is that Condor-G does not support all of the varied features of each batch system underlying GRAM.
- Many of the RMSs we have mentioned contain powerful scheduling components in their architecture.
- In the third step, the matchmaker informs both parties of the match.
- The responsibility of the matchmaker then ceases with respect to the match.
- When the job will actually run is at the mercy of the remote scheduler (see Figure 11.8).
- Matchmaking emerged over several versions of the Condor software.
- Indirect references [61] permit one ClassAd to refer to another and facilitate the construction of the I/O communities mentioned above.
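The symmetric nature of matchmaking can be illustrated with a small sketch in which ads are plain dictionaries and each party's Requirements is a predicate evaluated against the other ad. This is only a model: the real ClassAd language has its own expression syntax, Rank attributes for preference, and the indirect references discussed above.

```python
# A minimal sketch of ClassAd-style symmetric matchmaking. Ads are dicts and
# Requirements are Python predicates; the actual ClassAd language is richer.

def symmetric_match(ad_a, ad_b):
    """A match requires each ad's Requirements to hold against the other ad."""
    return ad_a["Requirements"](ad_a, ad_b) and ad_b["Requirements"](ad_b, ad_a)

job = {
    "Owner": "alice",
    "ImageSize": 64,  # MB of memory the job needs (illustrative attribute)
    "Requirements": lambda my, other: other.get("Memory", 0) >= my["ImageSize"],
}
machine = {
    "Name": "node1",
    "Memory": 128,
    # The machine owner's policy: only run jobs from trusted users.
    "Requirements": lambda my, other: other.get("Owner") in {"alice", "bob"},
}

print(symmetric_match(job, machine))  # True: both parties' policies are satisfied
```

Because each side evaluates its own Requirements, neither the agent nor the resource must surrender policy control to the matchmaker, which only brokers introductions.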
- A problem solver is a higher-level structure built on top of the Condor agent.
- Thus, any of the structures we describe below may be applied to an ordinary Condor pool or to a wide-area Grid computing scenario.
- One example is searches in which large portions of the problem space may be examined independently, yet the progress of the program is guided by intermediate results.
- The user must extend the classes to perform the necessary application-specific worker processing and master assignment, but all of the necessary communication details are transparent to the user.
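The division of labor can be sketched as follows. The class names are hypothetical stand-ins for the actual master-worker framework, which additionally handles worker loss and acquisition; here the user-supplied pieces are the subclass overriding `execute` and the task list handed to the master.

```python
# A hedged sketch of the master-worker pattern: the base classes own task
# distribution, and the user extends Worker with application-specific code.
from queue import Queue

class Master:
    def __init__(self, tasks):
        self.todo = Queue()
        for t in tasks:
            self.todo.put(t)
        self.results = []

    def run(self, workers):
        # Hand out tasks round-robin until none remain; communication details
        # (serialization, remote dispatch) would be hidden at this layer.
        while not self.todo.empty():
            for w in workers:
                if self.todo.empty():
                    break
                self.results.append(w.execute(self.todo.get()))
        return self.results

class Worker:
    def execute(self, task):
        raise NotImplementedError  # application-specific processing goes here

class SquareWorker(Worker):
    def execute(self, task):
        return task * task

master = Master([1, 2, 3, 4])
print(master.run([SquareWorker(), SquareWorker()]))  # [1, 4, 9, 16]
```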
- DAGMan might be thought of as a distributed, fault-tolerant version of the traditional make.
- Jobs may fail because of the nature of the distributed system.
- The rescue DAG is a new DAG listing the elements of the original DAG left unexecuted.
- For example, a corrupted executable or a dismounted file system should be detected by the distributed system and retried at the level of the agent.
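The rescue-DAG idea can be illustrated with a toy computation: given the original DAG and the set of nodes that completed, the rescue DAG keeps the unexecuted nodes and the dependencies among them. This sketch models only the bookkeeping, not DAGMan's actual input-file format.

```python
# Sketch: derive a rescue DAG from the original DAG and the completed nodes.
# Nodes are strings; edges are (parent, child) pairs.

def rescue_dag(edges, all_nodes, completed):
    remaining = [n for n in all_nodes if n not in completed]
    # Keep only dependencies between nodes that still need to run.
    kept_edges = [(p, c) for (p, c) in edges
                  if p not in completed and c not in completed]
    return remaining, kept_edges

# A diamond DAG: A feeds B and C, which both feed D. A and B finished
# before the run was interrupted.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(rescue_dag(edges, nodes, completed={"A", "B"}))
# (['C', 'D'], [('C', 'D')])
```

Resubmitting the rescue DAG thus resumes the computation without repeating finished work, which is exactly the property that makes the make analogy apt.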
- So far, this chapter has explored many of the techniques of getting a job to an appropriate execution site.
- However, that only solves part of the problem.
- None of this is made known outside of the agent until the actual moment of execution.
- 11.7.1 The standard universe.
- The standard universe was the only universe supplied by the earliest versions of Condor and is a descendant of the Remote UNIX [14] facility.
- The goal of the standard universe is to faithfully reproduce the user’s home POSIX environment for a single process running at a remote site.
- Figure 11.15 shows all of the components necessary to create the standard universe.
- It prepares the machine by creating a temporary directory for the job, and then fetches all of the job’s details – the executable, environment, arguments, and so on – and places them in the execute directory.
- It provides all of the job details for the sandbox and makes all of the necessary policy decisions about the job as it runs.
- The library converts all of the job’s standard system calls into secure remote procedure calls back to the shadow.
- It is vital to note that the shadow remains in control of the entire operation.
- This maximizes the flexibility of the user to make run-time decisions about exactly what runs where and when.
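The shadow/sandbox split can be modeled in miniature. In this hedged sketch, the job's reads are trapped and forwarded to a `Shadow` object standing in for the submit-site process; real Condor achieves this by relinking the job against a system-call library that issues secure remote procedure calls over the network.

```python
# Toy model of the standard universe: the job's I/O goes back to the shadow,
# which performs it against the home environment and enforces policy.
# The forwarding here is an in-process stand-in for Condor's remote RPCs.
import io

class Shadow:
    """Runs at the submit site; owns the job's files and policy decisions."""
    def __init__(self, home_files):
        # filename -> contents, standing in for the user's home file system
        self.home_files = home_files

    def rpc_read(self, path):
        if path not in self.home_files:
            raise FileNotFoundError(path)  # the shadow decides what the job may see
        return self.home_files[path]

class RemoteJob:
    """Runs at the execute site; its 'system calls' are RPCs to the shadow."""
    def __init__(self, shadow):
        self.shadow = shadow

    def open_read(self, path):
        # To the job, this looks like an ordinary local open().
        return io.StringIO(self.shadow.rpc_read(path))

shadow = Shadow({"input.txt": "data from home\n"})
job = RemoteJob(shadow)
print(job.open_read("input.txt").read().strip())  # data from home
```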
- 11.7.2 The Java universe.
- The components of the Java universe parallel those of the standard universe.
- The sandbox must ask the shadow for all of the job's details, just as in the standard universe.
- However, the location of the JVM is provided by the local administrator, as this may change from machine to machine.
- This unencrypted connection is kept secure by binding to the loopback network interface and presenting a shared secret.
- The sandbox then executes the job’s I/O requests along the secure RPC channel to the shadow, using all of the same security mechanisms and techniques as in the standard universe.
- However, there are several advantages of the I/O proxy over the more direct route used by the standard universe.
- For example, if a firewall lies between the execution site and the job’s storage, the sandbox may use its knowledge of the firewall to authenticate and pass through.
- Likewise, the user may provide credentials for the sandbox to use on behalf of the job without rewriting the job to make use of them.
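The shared-secret scheme on the loopback interface might look roughly like the following challenge-response sketch. The use of HMAC and this message shape are illustrative assumptions, not Condor's actual wire protocol; the point is that a connection confined to 127.0.0.1 can be authenticated without encryption by proving knowledge of a secret the sandbox handed to the job.

```python
# Sketch of loopback authentication via a shared secret (illustrative, not
# Condor's real protocol): the job proves it holds the secret the sandbox
# gave it at startup, without ever sending the secret itself.
import hmac, hashlib, os

SECRET = os.urandom(16)  # handed to the job by the sandbox at startup

def prove(secret, challenge):
    """The connecting job computes a keyed digest of the proxy's challenge."""
    return hmac.new(secret, challenge, hashlib.sha256).hexdigest()

def verify(secret, challenge, response):
    """The I/O proxy checks the response against its own copy of the secret."""
    expected = hmac.new(secret, challenge, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response)

challenge = os.urandom(16)           # issued by the I/O proxy per connection
response = prove(SECRET, challenge)  # computed by the connecting job
print(verify(SECRET, challenge, response))          # True
print(verify(os.urandom(16), challenge, response))  # False: wrong secret
```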
- In addition to all of the usual failures that plague remote execution, the Java environment is notoriously sensitive to installation problems, and many jobs.
- Micron Technology, Inc., has established itself as one of the leading worldwide providers of semiconductor solutions.
- This mission is exemplified by short cycle times, high yields, low production costs, and die sizes that are some of the smallest in the industry.
- The 70 Linux machines are all dual-CPU and mostly reside on the desktops of the animators.
- The site makes considerable use of the schema-free properties of ClassAds by inserting custom attributes into the job ClassAd.
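Because ClassAds are schema-free, adding a custom attribute requires no schema change anywhere in the system: it is simply a new name-value pair that policies elsewhere may reference, or ignore. The attribute names below are invented for illustration, modeling ads again as plain dictionaries.

```python
# Illustrative custom attributes in a job ClassAd (modeled as a dict).
# No central schema must be updated for the new names to be usable.
job_ad = {
    "Cmd": "render_frame",
    "Owner": "animator1",
    # Site-specific custom attributes are simply extra keys:
    "FrameNumber": 42,
    "ShotName": "seq03_shot11",
}

# A machine policy can then require or rank on the custom attributes,
# using a default when a job does not define them.
prefers_low_frames = lambda ad: ad.get("FrameNumber", 0) < 100
print(prefers_low_frames(job_ad))  # True
```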
- To combat this computational hurdle, a parallel implementation of the solver was developed which fit the master–worker model.
- We would like to acknowledge all of the people who have contributed to the development of the Condor system over the years.
- Proceedings of the International Symposium on Computer Performance, Modeling, Measurement and Evaluation, Yorktown Heights, New York, August 1977, pp.
- Proceedings of the Second International Conference on Distributed Computing Systems, Paris, France, April 1981, pp.
- Communications of the ACM.
- (1979) Systems aspects of the Cambridge Ring.
- Proceedings of the Seventh Symposium on Operating Systems Principles, Pacific Grove, CA, USA, 1979, pp.
- Proceedings of the 9th Symposium on Operating Systems Principles (SOSP), November 1983, pp.
- Communications of the ACM.
- Proceedings of the IEEE Workshop on Experimental Distributed Systems, October 1990.
- Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing (HPDC7), July 1998.
- Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing (HPDC9), Pittsburgh, PA, August 2000, pp.
- Proceedings of the 24th International Conference on Parallel Processing, Oconomowoc, WI, pp.
- Proceedings of the Workshop on Cluster Computing, 1992.
- (2001) Core algorithms of the Maui scheduler.
- Proceedings of the 7th Workshop on Job Scheduling Strategies for Parallel Processing, 2001.
- (1997) Astronomical and Biochemical Origins and the Search for Life in the Universe, Proceedings of the 5th International Conference on Bioastronomy.
- Proceedings of the Second Workshop on Environments and Tools for Parallel Scientific Computing, May 1994.
- Proceedings of the 11th IEEE Symposium on High Performance Distributed Computing (HPDC), July 2002.
- Proceedings of the IPPS/SPDP Workshop on Job Scheduling Strategies for Parallel Processing, 1998, pp.
- Proceedings of the 5th ACM Conference on Computer and Communications Security, 1998, pp.
- Proceedings of the Seventh Heterogeneous Computing Workshop, March.
- Proceedings of the Tenth IEEE Symposium on High Performance Distributed Computing (HPDC), San Francisco, CA, August.
- Proceedings of the Conference on Computing in High Energy Physics 2001 (CHEP01), Beijing, September 3–7.
- Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing (HPDC9), Pittsburgh, PA, August 2000, pp.
- Proceedings of the Eleventh IEEE Symposium on High Performance Distributed Computing, Edinburgh, Scotland, July 2002.
- Proceedings of the 11th IEEE Symposium on High Performance Distributed Computing (HPDC-11).
- Proceedings of the 5th Princeton Symposium on Information Sciences and Systems, ACM Operating Systems Review.
- Proceedings of the IEEE.
- Proceedings of the 15th ACM Symposium on Operating Systems Principles, December 1995, pp.
- Proceedings of the 2nd USENIX Symposium on Operating Systems Design and Implementation (OSDI), October 1996, pp.
- Proceedings of the 11th USENIX Security Symposium, San Francisco, CA, August 2002.
- Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing (HPDC8), 1999.
- Proceedings of the Tenth IEEE Symposium on High Performance Distributed Computing (HPDC10), San Francisco, CA, August
