Troubleshooting Tips for Citrix ADC (NetScaler)

I’ve collected numerous Citrix ADC (NetScaler) troubleshooting tips and commands over the years, so here they are. Note that some of these tools, file paths, or methods may have changed over time. Also note: single\double quotes are inconsistent (sorry) and usually not needed. Note a third time: don’t copy and paste from the web into the CLI\GUI – things will likely get mucked up.

Log File Locations

File | Description | Location
ns.conf | configuration file | /flash/nsconfig
ns.conf.x | older configuration file; increments after any config change | /flash/nsconfig
newnslog | main log file (ns data format) | /var/nslog
newnslog.xx.gz | archived newnslog file | /var/nslog
ns.lic | license file | /flash/nsconfig/license
nstrace.sh | script to collect nstrace | /netscaler
nstcpdump.sh | script to collect tcpdump | /netscaler
nstrace.x | packet trace file | /var/nstrace
vmcore.x.gz | core dump file during a crash | /var/crash
kernel.x | kernel dump file during a crash | /var/crash
process-pid | user process core file | /var/core
savecore.log | core dump log file | /tmp
pitboss.debug | open pipe for debug info | /tmp
aaad.debug | open pipe for authentication debug info | /tmp
ns.log | system syslog file | /var/log
messages | all logged entries | /var/log
auth.log | authentication/authorization | /var/log
dmesg.* | hardware errors/boot sequence errors | /var/nslog

Authentication

The most useful authentication troubleshooter is the aaad.debug pipe. From the shell, run: cat /tmp/aaad.debug. Note that it will not return to a prompt without a ctrl+c – you are viewing it in real time, so it is not like viewing a log file. You need to execute this command before someone tries to log in.

  • Pipe must be open to gather information
  • Watch for ‘Sending <accept | reject> to kernel for <username>’
  • RADIUS server responses will be seen
  • If NS is SAML SP, assertion will be seen, deflated
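
The pipe is chatty, so when all I want is the final verdict I filter it. This is a minimal sketch, not an official tool: the function name aaad_verdicts is made up, and the match is case-insensitive because the exact capitalisation of the message may differ between builds.

```shell
#!/bin/sh
# Hypothetical helper: reduce an aaad.debug stream to the accept/reject lines.
# Reads stdin so it can sit behind `cat /tmp/aaad.debug` on the appliance.
aaad_verdicts() {
    # -i: the exact capitalisation of the message is an assumption
    # -E: extended regex so (accept|reject) works as an alternation
    grep -i -E 'sending (accept|reject) to kernel for'
}
```

Usage from the shell: cat /tmp/aaad.debug | aaad_verdicts (still needs ctrl+c to stop, since the pipe never ends).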

Log strings found in /var/log/ns.log

These are mostly specific to Negotiate policies found when doing IWA.

Connection Issues

  • Couldn’t open server connection to http://1.1.1.1
  • Couldn’t create connection to ip 0xxxxx

Functional Messages (not errors)

  • NTLM: Sent NTLM Challenge to client > AFTER sending NTLM challenge
  • NTLM: NTLM auth successful!, user: <>
  • NTLM: NTLM Authentication failed for <>

Error Conditions

  • NTLM Auth: expected type1 found 3
  • NTLM RESP: Expected type2, found response code 200 is not 401
  • NTLM: Did not find Type2 from server, resetting state to 1
  • Unexpected NTLM type, 0, seen
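
To get a quick feel for which of these strings are showing up, a count per string helps. A rough sketch follows; ntlm_summary is a hypothetical name, and the markers are just substrings of the messages listed above, so adjust them if your build words things differently.

```shell
#!/bin/sh
# Hypothetical helper: count how often each NTLM marker appears in a log file.
ntlm_summary() {
    logfile=$1
    for marker in \
        'Sent NTLM Challenge to client' \
        'NTLM auth successful' \
        'NTLM Authentication failed' \
        'expected type1 found' \
        'Expected type2, found' \
        'Did not find Type2 from server' \
        'Unexpected NTLM type'
    do
        # -F: fixed string, -c: count matching lines.
        # `|| true` because grep exits non-zero when the count is 0.
        count=$(grep -F -c -- "$marker" "$logfile" || true)
        printf '%6d  %s\n' "$count" "$marker"
    done
}
```

Usage: ntlm_summary /var/log/ns.log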

SSL VPN Logins

  • realtime logins (CLI): tail -f /var/log/ns.log | grep "SSLVPN"
  • previous logins (CLI): grep "SSLVPN" /var/log/ns.log
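
Since ns.log gets rotated and compressed, older logins may only exist in the archived copies. Here is a small sketch that searches plain and gzipped files in one pass; sslvpn_entries is a hypothetical name, and the only assumption is that compressed logs end in .gz.

```shell
#!/bin/sh
# Hypothetical helper: print SSLVPN lines from any mix of plain and .gz logs.
# Files are read in the order given, so pass the oldest first.
sslvpn_entries() {
    for f in "$@"; do
        case $f in
            *.gz) gzip -dc -- "$f" ;;   # decompress archived logs to stdout
            *)    cat -- "$f" ;;        # current log is plain text
        esac
    done | grep -F 'SSLVPN'
}
```

Usage: sslvpn_entries /var/log/ns.log.*.gz /var/log/ns.log (shell glob order is alphabetical, which is not necessarily chronological).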

Crashes and Hangs

Crash dump files are stored in the following locations:

  • Citrix ADC (PPE) crash: /var/core
  • BSD system crash: /var/crash
  • Hang\race conditions:
    • Don’t force a reboot! You need a core analysis – dump the core
    • For physical appliances, use the NMI button
    • For virtual appliances, see https://support.citrix.com/article/CTX207598 ; make sure to put pb_policy back after gathering a dump
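
Before opening a support case, it is worth checking whether a dump already exists. A minimal sketch, assuming only the two directories named above; recent_dumps is a made-up name and the day count is whatever window you care about.

```shell
#!/bin/sh
# Hypothetical helper: list files modified within the last N days
# in each directory given. Directories that do not exist are skipped.
recent_dumps() {
    days=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -mtime -N: modified less than N*24 hours ago
        find "$dir" -type f -mtime "-$days"
    done
}
```

Usage: recent_dumps 7 /var/core /var/crash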

Interface Troubleshooting

Use ‘show interface’ to determine what is happening on the network interface

  • Look at InDisc() and OutDisc()
    • Discards: appliance asked to handle more traffic than it is capable of
  • Fctls: frame sent from switch saying there is too much traffic
  • Stalls: packet is on the interface and could not get out for processing in a certain amount of time
  • Hangs: BSD checking to see if the interface is responsive or not
  • Muted: implies there is a loop; seeing the same packet on multiple interfaces

Load Balancing Basic Troubleshooting

  • Does bypassing the LB vServer work?
  • Is DNS name resolution working?
  • Check the monitor state – is an appropriate monitor bound?
  • Where is the request getting to? Does the backend server get the request? Does the network need MBF?
  • Dumb down the vServer: if SSL, does HTTP work? If HTTP, does TCP work?
  • Check persistence settings:
    • Are we using SourceIP behind a proxy\NAT? If so, use SRCIPSRCPORTHASH LB method instead
    • If SSL, use SSLSession
    • COOKIEINSERT does not work for all clients or applications
    • Try disabling persistence and use SRCIPSRCPORTHASH LB method – this may help uneven LB

Local Syslog

Logs are stored in /var/log and named accordingly; logs are compressed and rotated as per the settings in /etc/newsyslog.conf

  • Newsyslog process runs every hour via cron
  • A log file must reach its configured size before it is rotated; rotated files are timestamped on the hour
  • See: https://support.citrix.com/article/CTX121898 to modify schedule
  • The rotation process can be debugged by running #newsyslog -v
  • *When using the local syslog viewer, always filter by module*

NSCONMSG (all the things)

Much of the (very detailed) performance data and stats of virtual servers is stored in the newnslog file in /var/nslog. Rotation of these files is controlled by nslog.sh and nsagg.conf – *modifying these files is NOT recommended* – each appliance will have unique optimization settings for these log files depending on appliance size, platform, etc. The nsconmsg command is run from the shell prompt.

*Read the help file!!* nsconmsg -help
*Read the CTX article!* https://support.citrix.com/article/CTX113341

Common nsconmsg arguments:

  • -d <operation>
    • current (current performance data)
    • stats (current statistics counters)
    • memstats (current memory statistics)
  • -K <file name> (performance information from this data file)
  • -s <name=value> (debug parameters)
    • ConLB (load balancing performance data)
    • ConCSW (content switching performance data)
    • ConSSL (SSL performance data)
  • -g <match string> (display only symbols that match this pattern)

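Some nsconmsg examples (note: oldconmsg is a -d operation that displays older console messages, not a file name; add -K <file name> to read from an archived newnslog):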

  • nsconmsg -d current -g cpu_use
  • nsconmsg -K newnslog -d event
  • nsconmsg -d current -g ha_cur_master_state
  • nsconmsg -s ConLB=2 -d oldconmsg
  • nsconmsg -s ConCSW=2 -d oldconmsg
  • nsconmsg -d current -g pol_hits
  • nsconmsg -s ConSSL=2 -d oldconmsg
  • nsconmsg -s ConCMP=2 -d oldconmsg
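
When I want several counters out of the same data file, I wrap the -K / -d current / -g combination in a loop. This is only a sketch: nslog_counters and the NSCONMSG override variable are my own names, not part of the appliance, and the override exists purely so the loop can be dry-run with echo.

```shell
#!/bin/sh
# NSCONMSG can be overridden (e.g. NSCONMSG='echo nsconmsg') for a dry run.
NSCONMSG=${NSCONMSG:-nsconmsg}

# Hypothetical helper: run `nsconmsg -K <file> -d current -g <pattern>`
# once per pattern, with a small header before each block of output.
nslog_counters() {
    datafile=$1
    shift
    for pattern in "$@"; do
        printf '== %s ==\n' "$pattern"
        # Unquoted on purpose so an override like 'echo nsconmsg' splits into words
        $NSCONMSG -K "$datafile" -d current -g "$pattern"
    done
}
```

Usage: nslog_counters /var/nslog/newnslog cpu_use pol_hits ha_cur_master_state. As far as I know, an archived file has to be uncompressed before -K can read it.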

Packet Captures

By default, the ADC uses the nstrace script and outputs to /var/nstrace in either CAP or PCAP file format (use ‘-traceformat’ to specify from the CLI). It can be run from the GUI or CLI.

  • Use ‘-size 0’ to capture all packets (specify zero in the ‘Packet Size’ field in the GUI)
  • Let the ADC decrypt all encrypted traffic in the trace with the ‘-mode sslplain’ argument
    • This is available in the GUI, but you must expand the More section
    • *BE AWARE* of what you are doing – saving unencrypted traffic!
    • This option eliminates the need to import private keys into wireshark
    • Note: wireshark cannot decrypt ECC!
  • Start a trace (CLI): start nstrace -size 0 -mode sslplain
  • Stop a trace (CLI): stop nstrace
  • Show the status of the trace: show nstrace
  • Capture filter for a specific vServer: -filter "vsvrname == <vserver_Name>"
  • Capture filter for a destination IP: -filter "DESTIP == <ip.address.here>"
  • Other filters:
    • SOURCEIP
    • DESTIP
    • DESTPORT
    • CONNECTION.INTF.EQ(0/1)*
    • CONNECTION.VLANID.EQ(3)*
    • *Interface\VLAN captures require the ‘-tcpdump ENABLED’ argument
  • Cyclical Traces can help troubleshoot intermittent issues by allowing you to define the length of time for each trace file and how many files before overwriting
    • Example: Start a new trace every 30 seconds and create no more than 50 files before starting to overwrite the files
    • >start nstrace -size 0 -mode sslplain -filter "CONNECTION.DSTIP.EQ(10.1.1.13) || CONNECTION.SRCIP.EQ(192.168.1.118)" -nf 50 -time 30
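
Because quotes and parentheses are easy to mangle, I sometimes generate the cyclical trace command instead of typing it. The sketch below only prints the command, using the same flags as the example above; nstrace_cyclic_cmd is a hypothetical name, and the defaults of 50 files and 30 seconds are simply taken from that example.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a cyclical `start nstrace` command.
# Args: destination IP, source IP, [number of files], [seconds per file]
nstrace_cyclic_cmd() {
    dst_ip=$1
    src_ip=$2
    files=${3:-50}      # -nf: files kept before overwriting
    seconds=${4:-30}    # -time: seconds per trace file
    printf 'start nstrace -size 0 -mode sslplain -filter "CONNECTION.DSTIP.EQ(%s) || CONNECTION.SRCIP.EQ(%s)" -nf %s -time %s\n' \
        "$dst_ip" "$src_ip" "$files" "$seconds"
}
```

Usage: nstrace_cyclic_cmd 10.1.1.13 192.168.1.118 50 30, then type the printed line at the ADC CLI prompt. Remember that sslplain means the trace files contain decrypted traffic.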

Performance Issues

A few notes:

  • In #top from BSD, the Packet Processing Engine (PPE) should always show at or near 100% utilization
  • Httpd is the web GUI process
  • CPU reported by the hypervisor may show 100% – PPE polling mode; see https://support.citrix.com/article/CTX229555 for more details
  • Use “>stat cpu” to see actual CPU usage by PPE
  • Gather current and\or previous newnslog files
  • Citrix ADC uses nsprofmon for CPU profiling
    • Started at boot time, runs continuously
    • If any PPE CPU exceeds 90%, data will be captured to newproflog_cpu_<cpu_id>.out
    • Logs to /var/nsproflog
  • NSPROFLOG data capture parameters can be modified
    • Before using, please read: https://support.citrix.com/article/CTX212480
    • nsproflog.sh cpuuse=700 start (will capture data when PPE CPU over 70%)
    • nsproflog.sh lctidle=2000 start (will capture data when idle CPU time exceeds 2ms in idle functions)
    • nsproflog.sh stop (stops the profiler and generates .tar.gz file with profiling data)

Policy Hits

This gets its own section because I use it ALL THE TIME. It will let you know which session policy or authentication policy is being hit by a gateway user (for example).

nsconmsg -d current -g pol_hits
nsconmsg -d current -g _hits
nsconmsg -s disptime=1 -d current -g pol_hits

Show Commands – Load Balancing

Useful commands

  • > show lb vserver <vServer Name>
  • > show cs vserver <vServer Name>
  • > show service <service name>
  • > show connectiontable (add: " | grep <IP address|port>")
  • > show connectiontable (add: "ip == <ip address> && state == established && svctype == SSL && svctype != MONITOR")
  • > show persistentSessions
  • > show dns addrec -type proxy

Show Commands – Performance

Useful commands

  • > show version
  • > show node
  • > show info
  • > show license
  • > show savedConfig
  • > show run
  • > show hardware
  • > show interface -summary
  • #sysctl -a netscaler | more
  • #dmesg
  • #cat /var/nslog/dmesg.boot
  • #tail -f /var/log/ns.log

Stat Commands

  • > stat ns
  • > stat interface -summary
  • > stat interface <interface name>
  • > stat ssl
  • > stat cpu
  • > stat lb vServer <vServer name>
  • > stat cs vServer <vServer name>
  • > stat service <service name>
  • > stat dns <records>
  • > stat http
