A three-dimensional nonlinear rigid-viscoplastic metal forming Finite Element (FE) simulation package, ALPID-3D, is being developed to run on distributed-memory Multiple Instruction Multiple Data (MIMD) parallel computers. Efficient parallelization of this application requires automatically partitioning the finite element domain into sub-domains and processing the compute-intensive parts of these sub-domains concurrently while minimizing inter-processor communication. Domain decomposition of the FE mesh is accomplished by creating an Element Interaction Graph (EIG) and partitioning the EIG into sub-graphs. The most compute-intensive part of any FE analysis is the generation and solution of the governing FE matrix equations. To minimize communication overhead during the solution of these equations, a Coarse Grain Element By Element Preconditioned Conjugate Gradient (CG-EBE-PCG) method is used. Experimentally measured performance on a Meiko i860 system is reported.
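As a rough illustration of the two ideas named above, the following serial Python sketch builds an element interaction graph and runs a conjugate gradient solve whose matrix-vector product is accumulated element by element, so the global matrix is never assembled. It is not the ALPID-3D implementation. Several things are assumptions made for the example: two elements are taken to interact when they share at least one node, a simple diagonal (Jacobi) preconditioner stands in for the coarse-grain EBE preconditioner, and all function names are hypothetical.

```python
def element_interaction_graph(elements):
    """Adjacency sets: two elements interact if they share a node (assumed rule)."""
    by_node = {}
    for e, nodes in enumerate(elements):
        for n in nodes:
            by_node.setdefault(n, set()).add(e)
    graph = {e: set() for e in range(len(elements))}
    for sharing in by_node.values():
        for e in sharing:
            graph[e] |= sharing - {e}
    return graph


def ebe_matvec(elements, element_matrices, x):
    """Compute y = A x as a sum of element contributions, without assembling A."""
    y = [0.0] * len(x)
    for nodes, ke in zip(elements, element_matrices):
        for a, i in enumerate(nodes):
            for b, j in enumerate(nodes):
                y[i] += ke[a][b] * x[j]
    return y


def ebe_diagonal(elements, element_matrices, n):
    """Diagonal of the (unassembled) global matrix, used as a Jacobi preconditioner."""
    d = [0.0] * n
    for nodes, ke in zip(elements, element_matrices):
        for a, i in enumerate(nodes):
            d[i] += ke[a][a]
    return d


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ebe_pcg(elements, element_matrices, b, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned CG for a symmetric positive definite system."""
    n = len(b)
    diag = ebe_diagonal(elements, element_matrices, n)
    x = [0.0] * n
    r = list(b)                                  # residual for the zero initial guess
    z = [ri / di for ri, di in zip(r, diag)]     # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        ap = ebe_matvec(elements, element_matrices, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy usage: a chain of three two-node elements with an SPD element matrix.
elements = [(0, 1), (1, 2), (2, 3)]
element_matrices = [[[2.0, -1.0], [-1.0, 2.0]] for _ in elements]
graph = element_interaction_graph(elements)
solution = ebe_pcg(elements, element_matrices, [0.0, 4.0, 6.0, 5.0])
```

In a distributed-memory setting, each processor would hold the elements of one EIG sub-graph and run the element loop of `ebe_matvec` locally, so that only contributions at nodes shared between sub-domains, together with the dot-product reductions, would need to be communicated.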