Exploring Group-Object in PowerShell


At some point in our ever-expanding PowerShell learning curve we’ve probably wanted to split a list based on a property, and written something along these lines:

$UsersWithX = @()
$UsersWithoutX = @()
foreach ($User in $Users) {
    if ($User.PropertyX -eq 'VerySpecificValue') {
        $UsersWithX += $User
    }
    else {
        $UsersWithoutX += $User
    }
}

It could be that we just retrieved a lot of AD users and we’re sorting them into lists based on whether they’re members of a specific AD group, or maybe we’re looking at server objects and checking how many of them run old operating systems.

You may be thinking “That could use some Where-Object!”, but even then we would end up in a situation where we make the same comparison twice. Let’s use services as an example.

# Make sure to only gather the data once
$Services = Get-Service
$RunningServices = $Services | Where-Object Status -eq 'Running'
$StoppedServices = $Services | Where-Object Status -eq 'Stopped'

So while we definitely improved readability, you could argue that the code is less optimal, as it has to evaluate the entire list of services twice. That’s not to say that Where-Object is bad: sometimes it’s outclassed and sometimes it’s an amazing tool. But for our specific problem above, do we have any alternatives?

Group-Object

Group-Object in PowerShell is an underestimated cmdlet which lets us solve the problem in a single command instead.

PipeHow:\Blog> $Services = Get-Service | Group-Object -Property Status
PipeHow:\Blog> $Services

Count Name    Group
----- ----    -----
  158 Stopped {...}
  116 Running {...}

Grouping objects on a property makes it easy to structure data, and instead of making several lists we can simply use the resulting list of groups. Each group contains the elements matching whatever logic we grouped them by, and we immediately get a glance at how many they are and what they have in common by looking at the name.
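
As a minimal sketch of how the resulting groups can be used, the snippet below groups a few made-up objects and pulls the members of one group back out through its Group property. The sample data and variable names are assumptions of mine, standing in for real Get-Service output.

```powershell
# Hypothetical sample data standing in for Get-Service output
$SampleServices = @(
    [pscustomobject]@{ Name = 'Spooler'; Status = 'Running' }
    [pscustomobject]@{ Name = 'WSearch'; Status = 'Stopped' }
    [pscustomobject]@{ Name = 'Dhcp';    Status = 'Running' }
)

# One pass over the data gives us one group per distinct Status
$Groups = $SampleServices | Group-Object -Property Status

# Pick out a single group by its name and expand the objects inside it
$Running = ($Groups | Where-Object Name -eq 'Running').Group

# Lists the names of the running sample services
$Running.Name
```

Filtering the groups by Name keeps the lookup independent of the order in which the groups happen to be returned.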

Group Info

In our example above we grouped services by status, and we see that my current computer has 116 running services and 158 stopped ones. Running Get-Member on a group in our collection shows us that each group object contains four properties that hold data about our objects.

PipeHow:\Blog> $Services[0] | Get-Member -MemberType Property

   TypeName: Microsoft.PowerShell.Commands.GroupInfo

Name   MemberType Definition
----   ---------- ----------
Count  Property   int Count {get;}
Group  Property   System.Collections.ObjectModel.Collection[psobject] Group {get;}
Name   Property   string Name {get;}
Values Property   System.Collections.ArrayList Values {get;}

We can see that after grouping, our list no longer contains services: examining the first object in the list shows us that it now contains objects of the type GroupInfo. Each one describes a set of objects that were grouped together based on the property or logic that we provided to the command.

The first three properties are pretty straightforward, but Values, a collection of all values we grouped the object by, may not be as clear yet. To explain what I mean by that, let’s take a look at the syntax of the cmdlet.

PipeHow:\Blog> (Get-Command Group-Object -Syntax) -replace ' \[',"`n["

Group-Object
[[-Property] <Object[]>]
[-NoElement]
[-AsHashTable]
[-AsString]
[-InputObject <psobject>]
[-Culture <string>]
[-CaseSensitive]
[<CommonParameters>]

Don’t mind the regex magic going on; I simply add a line break before each parameter for readability.

We can see that there are a few different ways to use the cmdlet. I will go through most of the parameters throughout the post, starting with the ones we’ve used so far. When using the pipeline as we have done, the InputObject parameter is connected to whatever we send through the pipeline, and other than that we’ve only really used the Property parameter which takes an array of objects.

An array of objects… Doesn’t that strike you as interesting? Not only does it mean that it can take several values and doesn’t necessarily expect just a single string with the name of a property to match by, it also means that it doesn’t even have to be a string!

Custom Grouping Logic

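That’s right, we can create custom grouping logic using whatever code we want, as long as it results in an object of some type. To showcase this, let’s get all processes on my computer and group them by four conditions: whether the first character of the process name matches the first character of today’s weekday, whether the process has run for at least two hours, whether it has a reference to a file (an executable), and whether it is responding.

We will do this by creating a scriptblock for each of the first three conditions, each of which evaluates to a string that becomes part of the group name. The fourth condition is simply the name of a property.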

$Now = Get-Date

# If the first character of the name is the same as the first character of today's weekday
$GroupCondition1 = {
    if ($_.Name[0] -eq $Now.DayOfWeek.ToString()[0]) {
        'Name=Today'
    }
    else {
        'Name!=Today'
    }
}

# If the process has run for at least two hours
$GroupCondition2 = {
    if ($_.StartTime -lt $Now.AddHours(-2)) {
        'Run > 2h'
    }
    else {
        'Run < 2h'
    }
}

# If the process has a reference to a file (an executable)
$GroupCondition3 = {
    if (-not [string]::IsNullOrWhiteSpace($_.Path)) {
        'Is .exe'
    }
    else {
        'Is not .exe'
    }
}

# We can still group by the name of a property on the object
$GroupCondition4 = 'Responding'

$GroupConditions = ($GroupCondition1,$GroupCondition2,$GroupCondition3,$GroupCondition4)
$GroupedProcesses = Get-Process | Group-Object $GroupConditions

$GroupedProcesses | Sort-Object Count -Descending

Count Name                                      Group
----- ----                                      -----
  129 Name!=Today, Run > 2h, Is not .exe, True  {...}
  122 Name!=Today, Run > 2h, Is .exe, True      {...}
    5 Name!=Today, Run < 2h, Is not .exe, True  {...}
    5 Name=Today, Run > 2h, Is .exe, True       {...}
    1 Name!=Today, Run > 2h, Is not .exe, False {...}
    1 Name=Today, Run > 2h, Is not .exe, True   {...}

Using our conditions with the positional parameter Property we can see that, among other groups, 129 of the processes on my computer do not start with a ‘W’ (this code is run on a Wednesday), have run for over two hours, do not have an executable file connected to them and are responding. The problem with simply grouping by a property such as Responding, which is either true or false, compared to the status of a service for example, is that you only get the literal true or false value in the GroupInfo object and no information about what it represents.

This is why I like making the grouping scriptblocks return strings that make it at least somewhat clearer what they mean. You could also attach custom properties to the objects and group by those, using for example Select-Object.
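
The Select-Object approach could look something like the sketch below. The sample objects and the ResponseState property name are made up for illustration; with real data you would pipe Get-Process into the same Select-Object call.

```powershell
# Hypothetical sample data standing in for Get-Process output
$SampleProcesses = @(
    [pscustomobject]@{ Name = 'pwsh';   Responding = $true }
    [pscustomobject]@{ Name = 'editor'; Responding = $false }
    [pscustomobject]@{ Name = 'shell';  Responding = $true }
)

# Add a calculated property that carries a descriptive label
$Labeled = $SampleProcesses | Select-Object -Property *, @{
    Name       = 'ResponseState'
    Expression = { if ($_.Responding) { 'Responding' } else { 'Not responding' } }
}

# Group by the new property instead of the raw true/false value
$ByState = $Labeled | Group-Object -Property ResponseState
$ByState | Select-Object Count, Name
```

The group names now read as 'Responding' and 'Not responding' rather than True and False, at the cost of working with copies of the original objects.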

Getting back to the Values property, you can now probably see why it’s a list.

PipeHow:\Blog> $GroupedProcesses[0].Values

Name!=Today
Run < 2h
Is not .exe
True

It’s simply a collection of the results, for this group, of each property or piece of logic that we grouped by. It’s what makes this group different from the other ones. The difference between this and the name is that Values holds the actual values as separate entries, while the name is a single concatenated string for readability.
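
To make the difference between Name and Values a bit more concrete, here is a small sketch that groups some made-up objects by two properties. The sample data and variable names are my own assumptions.

```powershell
# Hypothetical sample data with two properties to group by
$SampleFiles = @(
    [pscustomobject]@{ Extension = '.txt'; ReadOnly = $true }
    [pscustomobject]@{ Extension = '.txt'; ReadOnly = $false }
    [pscustomobject]@{ Extension = '.log'; ReadOnly = $false }
    [pscustomobject]@{ Extension = '.txt'; ReadOnly = $true }
)

$Grouped = $SampleFiles | Group-Object -Property Extension, ReadOnly

# Take the biggest group: the two read-only .txt entries
$Largest = $Grouped | Sort-Object Count -Descending | Select-Object -First 1

# Name is one concatenated string describing the group
$Largest.Name

# Values holds one entry per grouping criterion, in the order we specified them
$Largest.Values[0]
$Largest.Values[1]
```

Because the entries in Values are kept apart, we can compare against a single criterion without having to split the name string ourselves.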

NoElement

Let’s say we weren’t at all interested in what processes were in each group, we simply wanted a quick measurement of the custom grouping of our list. The parameter NoElement takes care of that and gives us a trimmed result.

PipeHow:\Blog> Get-Service | Group-Object Status -NoElement

Count Name
----- ----
  160 Stopped
  114 Running

Using this parameter instead gives us objects of the type GroupInfoNoElement with no objects in the Group property, which is handy when we simply want a quick way to measure data and are not interested in the actual objects. It seems like two services stopped running during the time I wrote this blog post. Unfortunately I’ll never know which ones, since as long as I use this parameter I only get the count of each group.
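
If you want to try this without depending on the services of a specific machine, a sketch along these lines should behave the same way. The sample log entries are made up for the example.

```powershell
# Hypothetical sample data: a few log entries with a severity level
$SampleLog = @(
    [pscustomobject]@{ Level = 'Error';   Message = 'Disk full' }
    [pscustomobject]@{ Level = 'Warning'; Message = 'Disk almost full' }
    [pscustomobject]@{ Level = 'Error';   Message = 'Service crashed' }
)

# Only count the entries per level, without keeping the entries themselves
$Summary = $SampleLog | Group-Object -Property Level -NoElement
$Summary | Sort-Object Count -Descending | Select-Object Count, Name

# The type name and the empty Group collection confirm nothing was kept
$Summary[0].GetType().Name
$Summary[0].Group.Count
```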

AsHashTable

If we turn it around from NoElement where we don’t get the data of the grouped objects, you could say that AsHashTable does the exact opposite. It omits the Count property and creates a hashtable with the name of each group as the key and a list of all the grouped objects as the value. You can of course still count them using the properties of the hashtable, but this lets you access the data in a very quick and structured way.

PipeHow:\Blog> $ServiceHash = Get-Service | Group-Object Status -AsHashTable
PipeHow:\Blog> $ServiceHash['Running'].Count
0

We created a hashtable, but something strange occurred. When trying to access the collection of running services we find none. This is because the keys of a hashtable are not restricted to strings, so the string 'Running' that we looked up does not necessarily match the actual key. In our case, the Status property that we group by is not actually a string, as can be seen if we inspect a service object.

PipeHow:\Blog> Get-Service | Get-Member Status

   TypeName: System.ServiceProcess.ServiceController

Name   MemberType Definition
----   ---------- ----------
Status Property   System.ServiceProcess.ServiceControllerStatus Status {get;}

PipeHow:\Blog> [enum]::GetValues([System.ServiceProcess.ServiceControllerStatus])
Stopped
StartPending
StopPending
Running
ContinuePending
PausePending
Paused

It’s actually an enum! While PowerShell does its best to present it as a readable string to us in the console, an enum is actually a type of its own. If you’re interested in reading more about them I wrote a post on classes and enums in PowerShell.

We can however combine AsHashTable with another parameter called AsString to force PowerShell to convert the keys to strings. Note that this may not always give you the result you expect, so if any problems occur when formatting the keys I would suggest you make sure to return a custom string in the group condition instead.

PipeHow:\Blog> $ServiceHash = Get-Service | Group-Object Status -AsHashTable -AsString
PipeHow:\Blog> $ServiceHash['Running'].Count
114
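
The same behavior can be sketched with any enum property. The example below uses System.DayOfWeek and some made-up objects (both are assumptions chosen so that it runs anywhere) and inspects the type of the keys with and without AsString.

```powershell
# Hypothetical sample data with an enum property
$Shifts = @(
    [pscustomobject]@{ Person = 'Ann'; Day = [System.DayOfWeek]::Monday }
    [pscustomobject]@{ Person = 'Bob'; Day = [System.DayOfWeek]::Friday }
    [pscustomobject]@{ Person = 'Cid'; Day = [System.DayOfWeek]::Monday }
)

# Without -AsString the keys keep the type of the grouped property
$ByDay = $Shifts | Group-Object -Property Day -AsHashTable
$KeyTypes = $ByDay.Keys | ForEach-Object { $_.GetType().Name } | Select-Object -Unique
$KeyTypes

# With -AsString the keys are plain strings, so a string lookup finds the group
$ByDayString = $Shifts | Group-Object -Property Day -AsHashTable -AsString
$ByDayString['Monday'].Person
```

Checking the key type like this is a quick way to find out why a lookup comes back empty before reaching for AsString.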

CaseSensitive

As with most things in PowerShell, Group-Object is not case-sensitive by default. We can however make sure that it is, by specifying the CaseSensitive parameter.

Let me demonstrate how it works by creating a couple of nonsense objects with names and values.

$TestLowerCase = [pscustomobject]@{
    'Name' = 'test'
    'Value' = 'lowercase'
}

$TestUpperCase = [pscustomobject]@{
    'Name' = 'TEST'
    'Value' = 'uppercase'
}

$TestTitleCase = [pscustomobject]@{
    'Name' = 'Test'
    'Value' = 'titlecase'
}

$TestList = $TestLowerCase,$TestUpperCase,$TestTitleCase

These three objects all have the name “test” but in different casing. Using the CaseSensitive parameter lets us differentiate between them in the groups.

PipeHow:\Blog> $TestList | Group-Object Name

Count Name Group
----- ---- -----
    3 test {@{Name=test; Value=lowercase}, @{Name=TEST; Value=uppercase}, @{Name=Test; Value=titlecase}}

PipeHow:\Blog> $TestList | Group-Object Name -CaseSensitive

Count Name                      Group
----- ----                      -----
    1 test                      {@{Name=test; Value=lowercase}}
    1 TEST                      {@{Name=TEST; Value=uppercase}}
    1 Test                      {@{Name=Test; Value=titlecase}}

As keys in a hashtable need to be unique, does this also work with the AsHashTable parameter? Only if you are running PowerShell 7 or later. In earlier PowerShell versions we will get an error about key duplication.

Beginning in PowerShell 7, Group-Object can combine the CaseSensitive and AsHashTable parameters to create a case-sensitive hash table. The hash table keys use case-sensitive comparisons, and the command outputs a System.Collections.Hashtable object.
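
A small sketch of that combination, reusing the same three nonsense objects, could look like this. The version check is my own addition so that the snippet only warns on older versions instead of failing.

```powershell
# The same three test objects as before, differing only in casing
$TestList = @(
    [pscustomobject]@{ Name = 'test'; Value = 'lowercase' }
    [pscustomobject]@{ Name = 'TEST'; Value = 'uppercase' }
    [pscustomobject]@{ Name = 'Test'; Value = 'titlecase' }
)

if ($PSVersionTable.PSVersion.Major -ge 7) {
    # Each casing becomes its own key in a case-sensitive hashtable
    $CaseHash = $TestList | Group-Object -Property Name -CaseSensitive -AsHashTable
    $CaseHash.Keys
}
else {
    Write-Warning 'Combining -CaseSensitive and -AsHashTable requires PowerShell 7 or later.'
}
```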

I hope you learned something new about grouping data in PowerShell, and hopefully you can improve some of those old nested comparison loops!
