We use a sample of 8298 galaxies observed as part of the Hubble Space Telescope (HST) H160‐band GOODS NICMOS Survey (GNS) to construct the galaxy stellar mass function as a function of both redshift and stellar mass up to z = 3.5. Our mass functions are constructed within the redshift range z = 1–3.5 and consist of galaxies with stellar masses from M* = 10^12 M⊙ down to nearly dwarf galaxy masses of M* = 10^8.5 M⊙ in the lowest redshift bin. We discover that a significant fraction of all massive M* > 10^11 M⊙ galaxies are in place up to the highest redshifts we probe, with a decreasing fraction of lower mass galaxies present at all redshifts. This is an example of ‘galaxy mass downsizing’, and is the result of massive galaxies forming before lower mass ones, and not simply ending their star formation earlier as in traditional downsizing scenarios, whose effect is seen at z < 1.5. By fitting Schechter functions to our mass functions we find that the faint‐end slope ranges from α = −1.36 to −1.73, which is significantly steeper than what is found in previous investigations of the mass function at high redshift. We demonstrate that this steeper mass function better matches the stellar mass added due to star formation, thereby alleviating some of the mismatch between these two measures of the evolution of galaxy mass. We furthermore examine the stellar mass function divided into blue/red systems, as well as for star‐forming and non‐star‐forming galaxies. We find a similar mass downsizing present for both blue/red and star‐forming/non‐star‐forming galaxies, and further find that red galaxies dominate at the high‐mass end of the mass function, but that the low‐mass galaxies are mostly blue, and therefore blue galaxies are creating the steep mass functions observed at z > 2.
We furthermore show that, although there is a downsizing such that high‐mass galaxies are nearer their z = 0 values at high redshift, this turns over at masses M* ∼ 10^10 M⊙, such that the lowest mass galaxies are more common than galaxies at slightly higher masses, creating a ‘dip’ in the observed galaxy mass function. We argue that the galaxy assembly process may be driven by different mechanisms at low and high masses, and that the efficiency of the galaxy formation process is lowest at masses M* ∼ 10^10 M⊙ at 1 < z < 3. Finally, we calculate the integrated stellar mass density for the total, blue and red populations. We find the integrated stellar mass density of the total and blue galaxy population is consistent with being constant over z = 1–2, while the red population shows an increase in integrated stellar mass density over the same redshift range.
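
As an illustration of how a Schechter faint‐end slope relates to an integrated stellar mass density, the following is a minimal sketch and not the fitting procedure used in this work. It assumes the standard single Schechter form per dex of stellar mass; the characteristic mass (log M* = 11) and normalisation (phi* = 10^-3) are hypothetical placeholder values, and only the two slopes α = −1.36 and −1.73 are taken from the text. The function names are likewise invented for this example.

```python
import math


def schechter_per_dex(log_m, log_m_star, phi_star, alpha):
    """Number density per dex of stellar mass for a single Schechter function."""
    x = 10.0 ** (log_m - log_m_star)  # M / M*
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


def mass_density(log_m_min, log_m_max, log_m_star, phi_star, alpha, n=20000):
    """Trapezoidal integral of M * Phi(log M) over log M between the two limits."""
    h = (log_m_max - log_m_min) / n
    total = 0.0
    for i in range(n + 1):
        log_m = log_m_min + i * h
        weight = 0.5 if i in (0, n) else 1.0  # end points get half weight
        total += weight * 10.0 ** log_m * schechter_per_dex(
            log_m, log_m_star, phi_star, alpha
        )
    return total * h


# Hypothetical Schechter parameters (assumptions, not values from the paper).
LOG_M_STAR = 11.0
PHI_STAR = 1.0e-3

for alpha in (-1.36, -1.73):  # slope range quoted in the text
    rho_total = mass_density(8.5, 12.0, LOG_M_STAR, PHI_STAR, alpha)
    rho_low = mass_density(8.5, 10.0, LOG_M_STAR, PHI_STAR, alpha)
    print(f"alpha={alpha}: fraction of mass below 10^10 Msun = {rho_low / rho_total:.2f}")
```

Under these assumed parameters, the steeper slope places a larger share of the integrated stellar mass in galaxies below 10^10 M⊙, which is the sense in which a steep faint end changes the total mass budget. Integrated over all masses, the same form has the closed‐form density phi* M* Γ(α + 2), which can serve as a check on the numerical integral.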