PMC.ATOMSILVERMONT(3)  FreeBSD Library Functions Manual  PMC.ATOMSILVERMONT(3)

NAME
     pmc.atomsilvermont - measurement events for Intel Atom Silvermont family
     CPUs

LIBRARY
     Performance Counters Library (libpmc, -lpmc)

SYNOPSIS
     #include <pmc.h>

DESCRIPTION
     Intel Atom Silvermont CPUs contain PMCs conforming to version 3 of the
     Intel performance measurement architecture.  These CPUs contain two
     classes of PMCs:

     PMC_CLASS_IAF     Fixed-function counters that count only one hardware
                       event per counter.

     PMC_CLASS_IAP     Programmable counters that may be configured to count
                       one of a defined set of hardware events.

     The number of PMCs available in each class and their widths need to be
     determined at run time by calling pmc_cpuinfo(3).

     Intel Atom Silvermont PMCs are documented in Combined Volumes, Intel 64
     and IA-32 Intel(R) Architecture Software Developer's Manual, Order Number
     325462-050US, Intel Corporation, February 2014.

   ATOM SILVERMONT FIXED FUNCTION PMCS
     These PMCs and their supported events are documented in pmc.iaf(3).

   ATOM SILVERMONT PROGRAMMABLE PMCS
     The programmable PMCs support the following capabilities:

     Capability           Support
     PMC_CAP_CASCADE      No
     PMC_CAP_EDGE         Yes
     PMC_CAP_INTERRUPT    Yes
     PMC_CAP_INVERT       Yes
     PMC_CAP_READ         Yes
     PMC_CAP_PRECISE      No
     PMC_CAP_SYSTEM       Yes
     PMC_CAP_TAGGING      No
     PMC_CAP_THRESHOLD    Yes
     PMC_CAP_USER         Yes
     PMC_CAP_WRITE        Yes

   Event Qualifiers
     Event specifiers for these PMCs support the following common qualifiers:

     any     Count matching events seen on any logical processor in a package.

     cmask=value
             Configure the PMC to increment only if the number of configured
             events measured in a cycle is greater than or equal to value.

     edge    Configure the PMC to count the number of de-asserted to asserted
             transitions of the conditions expressed by the other qualifiers.
             If specified, the counter will increment only once whenever a
             condition becomes true, irrespective of the number of clocks
             during which the condition remains true.

     inv     Invert the sense of comparison when the "cmask" qualifier is
             present, making the counter increment when the number of events
             per cycle is less than the value specified by the "cmask"
             qualifier.

     os      Configure the PMC to count events happening at processor
             privilege level 0.

     usr     Configure the PMC to count events occurring at privilege levels
             1, 2 or 3.

     If neither the "os" nor the "usr" qualifier is specified, the default is
     to enable both.
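
     As an illustration only, the common qualifiers above can be mapped onto
     the architectural IA32_PERFEVTSELx register layout described in the
     Intel manual cited earlier.  The sketch below is not libpmc's
     implementation; the struct evsel_quals type and the evsel_encode()
     function are hypothetical names introduced here, and the enable bit is
     always set for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the common qualifiers described above. */
struct evsel_quals {
	bool	usr;	/* "usr": count at privilege levels 1, 2 or 3 */
	bool	os;	/* "os": count at privilege level 0 */
	bool	edge;	/* "edge": count de-asserted to asserted transitions */
	bool	any;	/* "any": count on any logical processor */
	bool	inv;	/* "inv": invert the sense of the cmask comparison */
	uint8_t	cmask;	/* "cmask=value": per-cycle threshold, 0 disables */
};

/* Bit positions in the architectural IA32_PERFEVTSELx layout. */
#define	EVSEL_USR	(1u << 16)
#define	EVSEL_OS	(1u << 17)
#define	EVSEL_EDGE	(1u << 18)
#define	EVSEL_ANY	(1u << 21)
#define	EVSEL_EN	(1u << 22)
#define	EVSEL_INV	(1u << 23)

/* Build an event-select value from an event code, a umask and qualifiers. */
static uint32_t
evsel_encode(uint8_t event, uint8_t umask, struct evsel_quals q)
{
	uint32_t v = (uint32_t)event | ((uint32_t)umask << 8);

	/* Neither "os" nor "usr" given: enable both, as stated above. */
	if (!q.os && !q.usr)
		q.os = q.usr = true;
	if (q.usr)
		v |= EVSEL_USR;
	if (q.os)
		v |= EVSEL_OS;
	if (q.edge)
		v |= EVSEL_EDGE;
	if (q.any)
		v |= EVSEL_ANY;
	if (q.inv)
		v |= EVSEL_INV;
	v |= (uint32_t)q.cmask << 24;	/* threshold lives in bits 31:24 */
	return (v | EVSEL_EN);
}
```

     Under these assumptions, an event with Event C0H and Umask 00H and no
     qualifiers encodes as 0x4300C0: the event code, both privilege bits, and
     the enable bit.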

     Events that require core-specificity to be specified use an additional
     qualifier "core=core", where argument core is one of:

     all     Measure event conditions on all cores.

     this    Measure event conditions on this core.

     The default is "this".

     Events that require an agent qualifier to be specified use an additional
     qualifier "agent=agent", where argument agent is one of:

     this    Measure events associated with this bus agent.

     any     Measure events caused by any bus agent.

     The default is "this".

     Events that require a hardware prefetch qualifier to be specified use an
     additional qualifier "prefetch=prefetch", where argument prefetch is one
     of:

     both     Include all prefetches.

     only     Only count hardware prefetches.

     exclude  Exclude hardware prefetches.

     The default is "both".

     Events that require a cache coherence qualifier to be specified use an
     additional qualifier "cachestate=state", where argument state contains
     one or more of the following letters:

     e       Count cache lines in the exclusive state.

     i       Count cache lines in the invalid state.

     m       Count cache lines in the modified state.

     s       Count cache lines in the shared state.

     The default is "eims".
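
     The letter-based argument above could be parsed along the following
     lines.  This is a minimal sketch, not the parser used by libpmc: the
     cachestate_parse() function and the CS_* flag values are hypothetical
     and do not correspond to hardware unit mask bits.

```c
#include <stddef.h>

/* Hypothetical flag values for illustration; not hardware umask bits. */
enum {
	CS_E = 1 << 0,		/* exclusive */
	CS_I = 1 << 1,		/* invalid */
	CS_M = 1 << 2,		/* modified */
	CS_S = 1 << 3,		/* shared */
	CS_ALL = CS_E | CS_I | CS_M | CS_S
};

/*
 * Parse the argument of "cachestate=state".  A NULL argument stands for a
 * missing qualifier and yields the documented default "eims".  Returns -1
 * for an empty argument or an unknown letter.
 */
static int
cachestate_parse(const char *state)
{
	int flags = 0;

	if (state == NULL)
		return (CS_ALL);
	if (*state == '\0')
		return (-1);
	for (; *state != '\0'; state++) {
		switch (*state) {
		case 'e':
			flags |= CS_E;
			break;
		case 'i':
			flags |= CS_I;
			break;
		case 'm':
			flags |= CS_M;
			break;
		case 's':
			flags |= CS_S;
			break;
		default:
			return (-1);
		}
	}
	return (flags);
}
```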

     Events that require a snoop response qualifier to be specified use an
     additional qualifier "snoopresponse=response", where argument response
     consists of the following keywords separated by "+" signs:

     clean   Measure CLEAN responses.

     hit     Measure HIT responses.

     hitm    Measure HITM responses.

     The default is to measure all the above responses.
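
     A "+"-separated keyword list such as "hit+hitm" could be split as in the
     sketch below.  The snoopresponse_parse() function and the SR_* flags are
     hypothetical names used only for illustration, and matching keywords
     case-sensitively is an assumption of this sketch rather than documented
     behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values for illustration; not hardware umask bits. */
enum {
	SR_CLEAN = 1 << 0,
	SR_HIT = 1 << 1,
	SR_HITM = 1 << 2,
	SR_ALL = SR_CLEAN | SR_HIT | SR_HITM
};

/*
 * Parse the argument of "snoopresponse=response".  A NULL argument stands
 * for a missing qualifier and selects all responses, the documented
 * default.  Returns -1 on an empty or unknown keyword.
 */
static int
snoopresponse_parse(const char *resp)
{
	static const struct {
		const char *name;
		int flag;
	} kw[] = {
		{ "clean", SR_CLEAN }, { "hit", SR_HIT }, { "hitm", SR_HITM }
	};
	int flags = 0;

	if (resp == NULL)
		return (SR_ALL);
	for (;;) {
		size_t len = strcspn(resp, "+");	/* current keyword */
		size_t i;
		int found = 0;

		for (i = 0; i < sizeof(kw) / sizeof(kw[0]); i++) {
			/* Compare lengths too, so "hit" never matches "hitm". */
			if (strlen(kw[i].name) == len &&
			    strncmp(resp, kw[i].name, len) == 0) {
				flags |= kw[i].flag;
				found = 1;
				break;
			}
		}
		if (!found)
			return (-1);
		if (resp[len] == '\0')
			break;
		resp += len + 1;	/* skip past the "+" separator */
	}
	return (flags);
}
```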

     Events that require a snoop type qualifier use an additional qualifier
     "snooptype=type", where argument type is one of the following keywords:

     cmp2i   Measure CMP2I snoops.

     cmp2s   Measure CMP2S snoops.

     The default is to measure both snoops.

   Event Specifiers (Programmable PMCs)
     Atom Silvermont programmable PMCs support the following events:

     REHABQ.LD_BLOCK_ST_FORWARD
             (Event 03H, Umask 01H) The number of retired loads that were
             prohibited from receiving forwarded data from the store because
             of address mismatch.

     REHABQ.LD_BLOCK_STD_NOTREADY
             (Event 03H, Umask 02H) The cases where a forward was technically
             possible, but did not occur because the store data was not
             available at the right time.

     REHABQ.ST_SPLITS
             (Event 03H, Umask 04H) The number of retired stores that
             experienced cache line boundary splits.

     REHABQ.LD_SPLITS
             (Event 03H, Umask 08H) The number of retired loads that
             experienced cache line boundary splits.

     REHABQ.LOCK
             (Event 03H, Umask 10H) The number of retired memory operations
             with lock semantics.  These are either implicit locked
             instructions such as the XCHG instruction or instructions with an
             explicit LOCK prefix (0xF0).

     REHABQ.STA_FULL
             (Event 03H, Umask 20H) The number of retired stores that are
             delayed because there is not a store address buffer available.

     REHABQ.ANY_LD
             (Event 03H, Umask 40H) The number of load uops reissued from
             Rehabq.

     REHABQ.ANY_ST
             (Event 03H, Umask 80H) The number of store uops reissued from
             Rehabq.

     MEM_UOPS_RETIRED.L1_MISS_LOADS
             (Event 04H, Umask 01H) The number of load ops retired that miss
             in L1 Data cache.  Note that prefetch misses will not be counted.

     MEM_UOPS_RETIRED.L2_HIT_LOADS
             (Event 04H, Umask 02H) The number of load micro-ops retired that
             hit L2.

     MEM_UOPS_RETIRED.L2_MISS_LOADS
             (Event 04H, Umask 04H) The number of load micro-ops retired that
             missed L2.

     MEM_UOPS_RETIRED.DTLB_MISS_LOADS
             (Event 04H, Umask 08H) The number of load ops retired that had
             DTLB miss.

     MEM_UOPS_RETIRED.UTLB_MISS
             (Event 04H, Umask 10H) The number of load ops retired that had
             UTLB miss.

     MEM_UOPS_RETIRED.HITM
             (Event 04H, Umask 20H) The number of load ops retired that got
             data from the other core or from the other module.

     MEM_UOPS_RETIRED.ALL_LOADS
             (Event 04H, Umask 40H) The number of load ops retired.

     MEM_UOP_RETIRED.ALL_STORES
             (Event 04H, Umask 80H) The number of store ops retired.

     PAGE_WALKS.D_SIDE_CYCLES
             (Event 05H, Umask 01H) Every cycle when a D-side (walks due to a
             load) page walk is in progress.  Page walk duration divided by
             number of page walks is the average duration of page-walks.  Edge
             trigger bit must be cleared.  Set Edge to count the number of
             page walks.

     PAGE_WALKS.I_SIDE_CYCLES
             (Event 05H, Umask 02H) Every cycle when an I-side (walks due to
             an instruction fetch) page walk is in progress.  Page walk
             duration divided by number of page walks is the average duration
             of page walks.

     PAGE_WALKS.WALKS
             (Event 05H, Umask 03H) The number of times a data (D) page walk
             or an instruction (I) page walk is completed or started.  Since a
             page walk implies a TLB miss, the number of TLB misses can be
             counted by counting the number of pagewalks.

     LONGEST_LAT_CACHE.MISS
             (Event 2EH, Umask 41H) The number of L2 cache misses.  L3 is not
             supported in Silvermont microarchitecture.

     LONGEST_LAT_CACHE.REFERENCE
             (Event 2EH, Umask 4FH) The number of requests originating from
             the core that reference a cache line in the L2 cache.  L3 is not
             supported in Silvermont microarchitecture.

     L2_REJECT_XQ.ALL
             (Event 30H, Umask 00H) The number of demand and prefetch
             transactions that the L2 XQ rejects due to a full or near full
             condition which likely indicates back pressure from the IDI link.
             The XQ may reject transactions from the L2Q (non-cacheable
             requests), BBS (L2 misses) and WOB (L2 write-back victims).

     CORE_REJECT_L2Q.ALL
             (Event 31H, Umask 00H) The number of demand and L1 prefetcher
             requests rejected by the L2Q due to a full or nearly full
             condition which likely indicates back pressure from L2Q.  It also
             counts requests that would have gone directly to the XQ, but are
             rejected due to a full or nearly full condition, indicating back
             pressure from the IDI link.  The L2Q may also reject transactions
             from a core to ensure fairness between cores, or to delay a
             core's dirty eviction when the address conflicts with incoming
             external snoops.  (Note that L2 prefetcher requests that are
             dropped are not counted by this event.)

     CPU_CLK_UNHALTED.CORE_P
             (Event 3CH, Umask 00H) The number of core cycles while the core
             is not in a halt state.  The core enters the halt state when it
             is running the HLT instruction.  In mobile systems the core
             frequency may change from time to time.  For this reason this
             event may have a changing ratio with regard to time.

     CPU_CLK_UNHALTED.REF_P
             (Event 3CH, Umask 01H) The number of reference cycles that the
             core is not in a halt state.  The core enters the halt state when
             it is running the HLT instruction.  In mobile systems the core
             frequency may change from time to time.  This event is not
             affected by core frequency changes but counts as if the core is
             running at the maximum frequency all the time.

     ICACHE.HIT
             (Event 80H, Umask 01H) The number of instruction fetches from the
             instruction cache.

     ICACHE.MISSES
             (Event 80H, Umask 02H) The number of instruction fetches that
             miss the instruction cache or produce memory requests.  This
             includes uncacheable fetches.  An instruction fetch miss is
             counted only once and not once for every cycle it is outstanding.

     ICACHE.ACCESSES
             (Event 80H, Umask 03H) The number of instruction fetches,
             including uncacheable fetches.

     NIP_STALL.ICACHE_MISS
             (Event B6H, Umask 04H) The number of cycles the NIP stalls
             because of an icache miss.  This is a cumulative count of cycles
             the NIP stalled for all icache misses.

     OFFCORE_RESPONSE_0
             (Event B7H, Umask 01H) Requires MSR_OFFCORE_RESP0 to specify
             request type and response.

     OFFCORE_RESPONSE_1
             (Event B7H, Umask 02H) Requires MSR_OFFCORE_RESP1 to specify
             request type and response.

     INST_RETIRED.ANY_P
             (Event C0H, Umask 00H) The number of instructions that retire
             execution.  For instructions that consist of multiple micro-ops,
             this event counts the retirement of the last micro-op of the
             instruction.  The counter continues counting during hardware
             interrupts, traps, and inside interrupt handlers.

     UOPS_RETIRED.MS
             (Event C2H, Umask 01H) The number of micro-ops retired that were
             supplied from MSROM.

     UOPS_RETIRED.ALL
             (Event C2H, Umask 10H) The number of micro-ops retired.

     MACHINE_CLEARS.SMC
             (Event C3H, Umask 01H) The number of times that a program writes
             to a code section.  Self-modifying code causes a severe penalty
             in all Intel architecture processors.

     MACHINE_CLEARS.MEMORY_ORDERING
             (Event C3H, Umask 02H) The number of times that the pipeline was
             cleared due to memory ordering issues.

     MACHINE_CLEARS.FP_ASSIST
             (Event C3H, Umask 04H) The number of times that the pipeline
             stalled due to FP operations needing assists.

     MACHINE_CLEARS.ALL
             (Event C3H, Umask 08H) The number of times that the pipeline
             stalled due to any cause (including SMC, MO, FP assist, etc).

     BR_INST_RETIRED.ALL_BRANCHES
             (Event C4H, Umask 00H) The number of branch instructions retired.

     BR_INST_RETIRED.JCC
             (Event C4H, Umask 7EH) The number of branch instructions retired
             that were conditional jumps.

     BR_INST_RETIRED.FAR_BRANCH
             (Event C4H, Umask BFH) The number of far branch instructions
             retired.

     BR_INST_RETIRED.NON_RETURN_IND
             (Event C4H, Umask EBH) The number of branch instructions retired
             that were near indirect call or near indirect jmp.

     BR_INST_RETIRED.RETURN
             (Event C4H, Umask F7H) The number of near RET branch instructions
             retired.

     BR_INST_RETIRED.CALL
             (Event C4H, Umask F9H) The number of near CALL branch
             instructions retired.

     BR_INST_RETIRED.IND_CALL
             (Event C4H, Umask FBH) The number of near indirect CALL branch
             instructions retired.

     BR_INST_RETIRED.REL_CALL
             (Event C4H, Umask FDH) The number of near relative CALL branch
             instructions retired.

     BR_INST_RETIRED.TAKEN_JCC
             (Event C4H, Umask FEH) The number of branch instructions retired
             that were conditional jumps and predicted taken.

     BR_MISP_RETIRED.ALL_BRANCHES
             (Event C5H, Umask 00H) The number of mispredicted branch
             instructions retired.

     BR_MISP_RETIRED.JCC
             (Event C5H, Umask 7EH) The number of mispredicted branch
             instructions retired that were conditional jumps.

     BR_MISP_RETIRED.FAR
             (Event C5H, Umask BFH) The number of mispredicted far branch
             instructions retired.

     BR_MISP_RETIRED.NON_RETURN_IND
             (Event C5H, Umask EBH) The number of mispredicted branch
             instructions retired that were near indirect call or near
             indirect jmp.

     BR_MISP_RETIRED.RETURN
             (Event C5H, Umask F7H) The number of mispredicted near RET branch
             instructions retired.

     BR_MISP_RETIRED.CALL
             (Event C5H, Umask F9H) The number of mispredicted near CALL
             branch instructions retired.

     BR_MISP_RETIRED.IND_CALL
             (Event C5H, Umask FBH) The number of mispredicted near indirect
             CALL branch instructions retired.

     BR_MISP_RETIRED.REL_CALL
             (Event C5H, Umask FDH) The number of mispredicted near relative
             CALL branch instructions retired.

     BR_MISP_RETIRED.TAKEN_JCC
             (Event C5H, Umask FEH) The number of mispredicted branch
             instructions retired that were conditional jumps and predicted
             taken.

     NO_ALLOC_CYCLES.ROB_FULL
             (Event CAH, Umask 01H) The number of cycles when no uops are
             allocated and the ROB is full (less than 2 entries available).

     NO_ALLOC_CYCLES.RAT_STALL
             (Event CAH, Umask 20H) The number of cycles when no uops are
             allocated and a RATstall is asserted.

     NO_ALLOC_CYCLES.ALL
             (Event CAH, Umask 3FH) The number of cycles when the front-end
             does not provide any instructions to be allocated for any reason.

     NO_ALLOC_CYCLES.NOT_DELIVERED
             (Event CAH, Umask 50H) The number of cycles when the front-end
             does not provide any instructions to be allocated but the back
             end is not stalled.

     RS_FULL_STALL.MEC
             (Event CBH, Umask 01H) The number of cycles the allocation
             pipeline stalled because the RS for the MEC cluster is full.

     RS_FULL_STALL.ALL
             (Event CBH, Umask 1FH) The number of cycles the allocation
             pipeline stalled because any one of the RS is full.

     CYCLES_DIV_BUSY.ANY
             (Event CDH, Umask 01H) The number of cycles the divider is busy.

     BACLEARS.ALL
             (Event E6H, Umask 01H) The number of baclears for any type of
             branch.

     BACLEARS.RETURN
             (Event E6H, Umask 08H) The number of baclears for return
             branches.

     BACLEARS.COND
             (Event E6H, Umask 10H) The number of baclears for conditional
             branches.

     MS_DECODED.MS_ENTRY
             (Event E7H, Umask 01H) The number of times the MSROM starts a
             flow of UOPS.
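
     Several of the events above are typically read in pairs and combined
     into ratios.  The sketch below shows a few such derived metrics computed
     from raw counter readings that have already been obtained by some other
     means.  The average page walk duration follows the PAGE_WALKS
     descriptions; the other ratios are commonly used combinations and are
     not prescribed by this manual.  All function names are hypothetical.

```c
#include <stdint.h>

/* Divide two counter readings; yields 0.0 when the denominator is zero. */
static double
pmc_ratio(uint64_t num, uint64_t den)
{
	return (den == 0 ? 0.0 : (double)num / (double)den);
}

/* INST_RETIRED.ANY_P per CPU_CLK_UNHALTED.CORE_P cycle. */
static double
ipc(uint64_t inst_retired, uint64_t core_cycles)
{
	return (pmc_ratio(inst_retired, core_cycles));
}

/* BR_MISP_RETIRED.ALL_BRANCHES over BR_INST_RETIRED.ALL_BRANCHES. */
static double
branch_mispredict_ratio(uint64_t br_misp, uint64_t br_inst)
{
	return (pmc_ratio(br_misp, br_inst));
}

/* LONGEST_LAT_CACHE.MISS over LONGEST_LAT_CACHE.REFERENCE. */
static double
l2_miss_ratio(uint64_t l2_miss, uint64_t l2_ref)
{
	return (pmc_ratio(l2_miss, l2_ref));
}

/*
 * PAGE_WALKS.D_SIDE_CYCLES counted without "edge" (walk cycles) over the
 * same event counted with "edge" (number of walks).
 */
static double
avg_page_walk_cycles(uint64_t walk_cycles, uint64_t walks)
{
	return (pmc_ratio(walk_cycles, walks));
}
```

     Returning 0.0 for an empty denominator is a simplification chosen for
     this sketch; a caller that must tell "no samples" apart from a true zero
     would need to check the denominator itself.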

SEE ALSO
     pmc(3), pmc.atom(3), pmc.core(3), pmc.core2(3), pmc.iaf(3), pmc.k7(3),
     pmc.k8(3), pmc.soft(3), pmc.tsc(3), pmc_cpuinfo(3), pmclog(3), hwpmc(4)

HISTORY
     The pmc library first appeared in FreeBSD 6.0.

AUTHORS
     The Performance Counters Library (libpmc, -lpmc) was written by Joseph
     Koshy <jkoshy@FreeBSD.org>.  The support for the Atom Silvermont
     microarchitecture was written by Hiren Panchasara <hiren@FreeBSD.org>.

FreeBSD 13.1-RELEASE-p6          April 6, 2017         FreeBSD 13.1-RELEASE-p6
