Monday, December 28, 2009

Wishbone: Profile-based Partitioning for Sensornet Applications

Two major problems with application partitioning:
  • Heterogeneity
  • Decomposition
The requirements for Wishbone applications:
  1. Streaming dataflow model: the application must be expressed as a dataflow graph
  2. Predictable input rates and patterns: because partitioning is driven by profiling, the profiled rates must be representative of deployment
The front-end creates a dataflow graph. The backend performs graph optimization and reduces work functions to an intermediate language that can be fed to a number of code generators.
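
As a minimal illustration (operator and stream names are made up, not from the paper), the front-end's output is essentially a graph of operators connected by streams:

    # A made-up dataflow graph: work functions (operators) connected by streams.
    graph = {
        "operators": ["sample", "filter", "fft", "sink"],
        "streams": [("sample", "filter"), ("filter", "fft"), ("fft", "sink")],
    }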

A namespace is used to logically define the distribution of the code, i.e., the code that can be distributed, not the code that necessarily needs to be distributed.

If the code placed (logically) on a node is stateful, the state of those stateful operators must live on the node as well.

Stateful server operators cannot be moved into the network; however, stateful node operators can be brought to the server.

The system has two modes, conservative and permissive. In conservative mode, stateful node operators are never pushed to the server; in permissive mode they may be, provided the application can tolerate data loss.

To measure data flow, the Scheme compiler executes the code during compilation, producing platform-independent data rates.

Once partitioned, each partition is executed on simulated or real hardware to measure its CPU footprint. Timing statements are placed at the beginning and end of each operator, and the resulting measurements also help extract the footprint of each piece of the code.
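
A minimal sketch of that instrumentation in Python (the real system emits it into generated code; the names here are illustrative):

    # Bracket each operator with timestamps to attribute CPU time per operator.
    import time

    def timed(name, stats):
        def wrap(fn):
            def run(*args, **kwargs):
                t0 = time.process_time()          # timestamp at operator entry
                try:
                    return fn(*args, **kwargs)
                finally:                          # timestamp at operator exit
                    stats[name] = stats.get(name, 0.0) + time.process_time() - t0
            return run
        return wrap

    stats = {}

    @timed("fft", stats)
    def fft(window):
        return sorted(window)                     # stand-in for the real work function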

Cost is measured as a weighted sum, Cost = aC + bNet, where C is the partition's CPU cost and Net its network cost, with weights a and b.

The ILP formulation used for partitioning the graph computes a minimum-cost cut (see the sketch below).
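
To make the cost model and the cut concrete, here is a minimal sketch for the two-machine (node vs. server) case, solved as an s-t min cut; all operator names, costs, and weights are made up, and the paper's general formulation is an ILP:

    import networkx as nx

    a, b = 1.0, 2.0  # hypothetical weights on CPU cost (a) and network cost (b)

    # Per-operator CPU cost: (cost if run on the node, cost if run on the server).
    ops = {"sample": (5, 1), "filter": (3, 1), "fft": (20, 2)}
    streams = {("sample", "filter"): 10, ("filter", "fft"): 4}  # bytes/sec if cut

    G = nx.DiGraph()
    for op, (cpu_node, cpu_server) in ops.items():
        G.add_edge("NODE", op, capacity=a * cpu_server)  # cut => op on the server
        G.add_edge(op, "SERVER", capacity=a * cpu_node)  # cut => op on the node
    for (u, v), rate in streams.items():
        G.add_edge(u, v, capacity=b * rate)  # cut => stream crosses the network
        G.add_edge(v, u, capacity=b * rate)

    # Pinning an operator (e.g., a stateful node operator in conservative mode)
    # would be an infinite-capacity edge from NODE to that operator.
    cost, (node_side, server_side) = nx.minimum_cut(G, "NODE", "SERVER")
    print(cost, node_side - {"NODE"}, server_side - {"SERVER"})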

Wednesday, December 16, 2009

Dynamic Function Placement for Data-intensive Cluster Computing

Application partitioning is difficult because of:
  • variation in application behavior
  • variability in resource availability
  • variability in workload mixes
Effective use of cluster resources requires
  1. load balancing
  2. proper partitioning of functionality among producers and consumers
In Abacus, function placement is based solely on black-box monitoring, removing from programmers the burden of worrying about function placement.

Abacus consists of a programming model and a runtime system. In the Abacus programming model, programmers need to define their components as explicitly migratable, functionally independent components (objects).

Anchored elements need to be explicitly defined in the graph of the application. I think this is required because, when it comes to modeling the graph of the application, these components should be marked properly.

Abacus components:
  1. Migration and Location-Transparent Invocation Component (Binding Manager; see the sketch after this list)
  2. Resource Monitoring and Management Component (Resource Manager)
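
For the Binding Manager, a minimal sketch of location-transparent invocation in Python; the rebind hook is my assumption about what the runtime would call after a migration:

    class Binding:
        # Callers invoke through the binding; the runtime rebinds it when the
        # underlying object migrates, so call sites never change.
        def __init__(self, obj):
            self._obj = obj

        def rebind(self, new_obj):      # called by the runtime after migration
            self._obj = new_obj

        def __getattr__(self, name):
            return getattr(self._obj, name)   # forward the invocation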

The Resource Manager uses notifications to collect monitoring information (monitoring and profiling happen at runtime).

The best net benefit is calculated by the server in order to determine whether the migration is worthwhile (i.e., whether it meets the minimum requirements for migration). Code mobility and dynamic linking are sidestepped in this model.
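
A sketch of that net-benefit test; the window, cost estimates, and threshold are assumptions rather than the paper's exact formula:

    def should_migrate(cur_cost, new_cost, move_cost, window_s, margin=0.05):
        # Migrate only when the predicted savings over the next adaptation
        # window beat the one-time cost of moving the object by some margin.
        return (cur_cost - new_cost) * window_s - move_cost > margin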

Mobile Objects are defined by the programmer.
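
A minimal sketch of what such an object could look like, assuming a hypothetical checkpoint/restore interface for moving per-object state:

    import pickle

    class FilterObject:
        def __init__(self, threshold):
            self.threshold = threshold
            self.passed = 0                    # per-object state that must move too

        def process(self, value):
            if value > self.threshold:
                self.passed += 1
                return value
            return None

        def checkpoint(self) -> bytes:         # runtime calls this before a move
            return pickle.dumps((self.threshold, self.passed))

        def restore(self, blob: bytes):        # and this after the move
            self.threshold, self.passed = pickle.loads(blob)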

Cluster characteristics critical for function placement:
  • Communication bandwidth between nodes
  • Relative processor speed among nodes
  • Workload characteristics (e.g., bytes moved among functions, instructions executed by each function)
-> Data-intensive applications: those that selectively filter, mine, sort, or otherwise manipulate large data sets. Spread the parallel computations across the source/sink servers.

"Programmable storage services" is what they consider as a potential alternative to "cloud" when it comes to naming.

The difference between Coign and Abacus is that Coign relies on the profiling history of functions/components to make decisions, while Abacus makes placement decisions at runtime.

Equanimity dynamically balances the load between a single client and its servers; Abacus extends this to real-world clusters, handling resource contention, resource heterogeneity, and workload variation.

Function placement is adapted dynamically based on resource usage and availability.

The two applications used in Abacus:
  1. The file system
  2. The search application

Goals for Abacus:
  1. improve overall performance

Parameters measured (a per-object record sketch follows the list):
  • Data Flow Graph
  • Memory Consumption
  • Instructions Executed per Byte
  • Stall Time
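
A sketch of the per-object record such monitoring might maintain (field names are my assumptions, not the paper's API):

    from dataclasses import dataclass

    @dataclass
    class ObjectStats:
        bytes_moved: int = 0          # traffic to/from other objects
        instructions: int = 0
        resident_bytes: int = 0       # memory consumption
        stall_seconds: float = 0.0    # time blocked on I/O or messages

        def instructions_per_byte(self) -> float:
            return self.instructions / self.bytes_moved if self.bytes_moved else 0.0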

Sunday, December 13, 2009

The Coign Automatic Distributed Partitioning System

The problem:

Pieces of applications need to be partitioned and placed on different nodes. Repartitioning is not done frequently because of the effort required, even though it may buy a lot of efficiency for the application.

Reprofiling is supported: the application is profiled periodically and the optimal distribution is recalculated.

The architecture for Coign:

  • The application is augmented with Coign instrumentation using a binary rewriter.
  • The instrumented binary is run through a set of profiling scenarios (instrumentation degrades application performance; inter-component communications are summarized).
  • The profile analysis engine combines component communication profiles and component location constraints to create an abstract inter-component communication (ICC) graph.
  • Location constraints are obtained from the programmer, from analysis of component communication records, and from application binaries.
  • The ICC graph is combined with a network profile to create a graph of potential communication time on the network (see the sketch after this list).
  • The graph cutting algorithm: lift-to-front minimum cut.
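
For the ICC-to-time step, a sketch under an assumed latency/bandwidth network profile (the numbers are made up):

    LATENCY_S = 0.001            # per-message latency, seconds
    BANDWIDTH_BPS = 12.5e6       # link bandwidth, bytes per second

    def comm_time(messages, total_bytes):
        # Estimated time if this edge of the ICC graph crosses the network.
        return messages * LATENCY_S + total_bytes / BANDWIDTH_BPS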

Of the components of the Coign runtime, the instance classifier is probably the most important: it tries to identify similarities between new instances and previously extracted profiles so that a new instance can inherit an existing profile. They list the following classifiers, which need further investigation (a sketch of an instantiated-by-style classifier follows the list):
  1. incremental classifier (Straw man classifier)
  2. Procedure Called-By Classifier (PCB)
  3. Static Type Classifier (ST)
  4. Static Type Called-By Classifier (STCB)
  5. Internal-function Called-By Classifier (IFCB)
  6. Entry Point Called-By Classifier (EPCB)
  7. Instantiated-By Classifier (IB)
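
A Python sketch of an instantiated-by-style classifier, where instances created at the same call site share one profile; the real classifiers work on COM binaries, so this is only illustrative:

    import traceback

    profiles = {}

    def creation_site():
        # Two frames up: the code that constructed the component instance.
        frame = traceback.extract_stack(limit=3)[0]
        return (frame.filename, frame.lineno)

    class Component:
        def __init__(self):
            # Instances created at the same call site share one profile.
            self.profile = profiles.setdefault(creation_site(), {"msgs": 0, "bytes": 0})

    def make_component():
        return Component()       # every instance from this line shares a profile

    a, b = make_component(), make_component()
    assert a.profile is b.profile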

The next step is correlating the profile of one instance with that of another, based on similar resource usage and communication behavior; they use an instance communication vector for this.
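
A sketch of that correlation using cosine similarity over the vectors; the paper's exact distance measure may differ:

    import math

    def similarity(vec_a, vec_b):
        # Each vector holds, e.g., bytes exchanged with each peer component.
        dot = sum(x * y for x, y in zip(vec_a, vec_b))
        na = math.sqrt(sum(x * x for x in vec_a))
        nb = math.sqrt(sum(y * y for y in vec_b))
        return dot / (na * nb) if na and nb else 0.0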
