We propose a novel framework that coordinates segmentation and recognition so that the two tasks are carried out collaboratively and iteratively. To enable this cooperation, objects are described in two aspects, shape and appearance, which are learned and used as constraints on the segmentation so that the object segmentation mask is consistent with both the object regions in the image and the learned knowledge of the object. For shape, a bottom-top-bottom pathway is built using an encoder-decoder network with capsule neurons: the encoder extracts the shape features that are used for recognition, and the decoder generates reference shapes from these features and the recognition result. During this procedure, the capsule neurons can parse the existence of the object and cope with interference in the segmentation. Appearance knowledge is used in a separate pathway to assist the segmentation. Both the shape and the appearance information depend on the recognition result, which allows the classifier to convey object information to the segmenter. Experiments demonstrate the effectiveness of our framework and model in collaboratively segmenting and recognizing objects that can be recognized by their shapes or shape patterns.
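
The iterative cooperation described above can be illustrated with a toy sketch. This is not the proposed model: fixed binary templates stand in for the learned capsule encoder-decoder, a per-class mean intensity stands in for the learned appearance knowledge, and all names (`recognize`, `segment`, `cooperate`, `SHAPES`, `APPEARANCE`) are hypothetical. It only shows the control flow in which the recognition result selects the shape and appearance constraints that refine the next segmentation.

```python
# Toy sketch of an iterative segment-recognize loop (pure Python, no dependencies).
# ASSUMPTION: fixed templates replace the learned capsule encoder-decoder, and a
# per-class mean intensity replaces the learned appearance model.

def iou(a, b):
    """Intersection over union of two binary masks (lists of lists of 0/1)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0

def recognize(mask, shapes):
    """Recognition step: pick the class whose template best matches the mask."""
    return max(shapes, key=lambda c: iou(mask, shapes[c]))

def segment(image, ref_shape, ref_intensity, tol=0.3):
    """Segmentation step: keep pixels that lie inside the reference shape and
    whose intensity is within `tol` of the class appearance."""
    return [[1 if s and abs(p - ref_intensity) <= tol else 0
             for p, s in zip(row_p, row_s)]
            for row_p, row_s in zip(image, ref_shape)]

def cooperate(image, shapes, appearance, init_thresh=0.5, max_iters=10):
    """Alternate recognition and segmentation until the mask stops changing."""
    # Initial segmentation: a plain intensity threshold, with no object knowledge.
    mask = [[1 if p > init_thresh else 0 for p in row] for row in image]
    label = None
    for _ in range(max_iters):
        label = recognize(mask, shapes)
        new_mask = segment(image, shapes[label], appearance[label])
        if new_mask == mask:
            break
        mask = new_mask
    return label, mask

# Hypothetical 4x4 example: a vertical bar plus one bright distractor at (0, 3).
SHAPES = {
    "bar":    [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    "square": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
}
APPEARANCE = {"bar": 0.9, "square": 0.9}
image = [
    [0.1, 0.9, 0.1, 0.8],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.8, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
]
label, mask = cooperate(image, SHAPES, APPEARANCE)
# label is "bar"; the distractor pixel at (0, 3) is removed from the mask.
```

In this sketch the initial threshold mask contains the distractor, the recognition step still selects the bar class, and the shape constraint of that class then removes the distractor in the next segmentation pass.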