I’ve collected numerous Citrix ADC (NetScaler) troubleshooting tips and commands over the years, so here they are. Some of these tools, file paths, or methods may have changed over time. Quoting in the examples is inconsistent (sorry), and the quotes are usually not needed. Finally, don’t copy and paste from the web into the CLI\GUI: characters such as quotes and dashes will likely get mangled.
Log File Locations
File | Description | Location |
ns.conf | configuration file | /flash/nsconfig |
ns.conf.x | older configuration file; increments after any config change | /flash/nsconfig |
newnslog | main log file (ns data format) | /var/nslog |
newnslog.xx.gz | archived newnslog file | /var/nslog |
ns.lic | license file | /flash/nsconfig/license |
nstrace.sh | script to collect nstrace | /netscaler |
nstcpdump.sh | script to collect tcpdump | /netscaler |
nstrace.x | packet trace file | /var/nstrace |
vmcore.x.gz | core dump file during a crash | /var/crash |
kernel.x | kernel dump file during a crash | /var/crash |
process-pid | user process core file | /var/core |
savecore.log | core dump log file | /tmp |
pitboss.debug | open pipe for debug info | /tmp |
aaad.debug | open pipe for authentication debug info | /tmp |
ns.log | system syslog file | /var/log |
messages | all logged entries | /var/log |
auth.log | authentication/authorization | /var/log |
dmesg.* | hardware errors/boot sequence errors | /var/nslog |
Authentication
The most useful authentication troubleshooter is the aaad.debug pipe. It streams events in real time rather than storing them like a log file, so it will not return to a prompt until you press Ctrl+C, and you need to run the command before someone tries to log in.
> shell
# cat /tmp/aaad.debug
- Pipe must be open to gather information
- Watch for ‘Sending <accept | reject> to kernel for <username>’
- RADIUS server responses will be seen
- If NS is SAML SP, assertion will be seen, deflated
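The pipe is noisy, so it can help to narrow it to the accept/reject verdicts. This is a minimal sketch, assuming the verdict lines contain the text quoted above. The function name aaad_verdicts is made up for illustration, and the match is case-insensitive because the exact capitalization may differ between builds.

```shell
# Keep only the accept/reject verdict lines from aaad.debug output.
# Reads the file named in $1, or standard input when no argument is given.
aaad_verdicts() {
    # -i ignores case, -E enables the (accept|reject) alternation
    grep -iE 'sending (accept|reject) to kernel for' "${1:--}"
}

# Typical use from the shell prompt (stop with Ctrl+C):
#   cat /tmp/aaad.debug | aaad_verdicts
```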
Log strings found in /var/log/ns.log
These are mostly specific to Negotiate policies found when doing IWA.
Connection Issues
- Couldn’t open server connection to http://1.1.1.1
- Couldn’t create connection to ip 0xxxxx
Functional Messages (not errors)
- NTLM: Sent NTLM Challenge to client > AFTER sending NTLM challenge
- NTLM: NTLM auth successful!, user: <>
- NTLM: NTLM Authentication failed for <>
Error Conditions
- NTLM Auth: expected type1 found 3
- NTLM RESP: Expected type2, found response code 200 is not 401
- NTLM: Did not find Type2 from server, resetting state to 1
- Unexpected NTLM type, 0, seen
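To see at a glance which of the error conditions above show up in a log, something like the following can be run from the shell. It is only a sketch: the function name ntlm_error_counts is hypothetical, and the search strings are shortened copies of the messages listed above, matched literally.

```shell
# Count how often each NTLM error string from the list above appears in a log.
# $1 is the log to scan; it defaults to /var/log/ns.log.
ntlm_error_counts() {
    log="${1:-/var/log/ns.log}"
    for pattern in \
        'NTLM Auth: expected type1 found 3' \
        'Expected type2, found response code' \
        'Did not find Type2 from server' \
        'Unexpected NTLM type'
    do
        # -F matches the text literally; -c prints the number of matching lines
        count=$(grep -F -c -- "$pattern" "$log")
        printf '%s\t%s\n' "$count" "$pattern"
    done
}
```

Each output line is a count, a tab, and the string that was searched for, so a row of zeros means none of these conditions were logged.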
SSL VPN Logins
- realtime logins (CLI): tail -f /var/log/ns.log | grep "SSLVPN"
- previous logins (CLI): grep "SSLVPN" /var/log/ns.log
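The grep above only covers the live ns.log. Older logins may already have been rotated and compressed, so a sketch like this one searches those copies too. It assumes the rotated files sit next to ns.log and are named ns.log.<n>.gz. The function name sslvpn_history is made up for illustration.

```shell
# Search the live ns.log and any rotated, gzip-compressed copies for SSLVPN entries.
# $1 is the log directory; it defaults to /var/log.
sslvpn_history() {
    dir="${1:-/var/log}"
    # Rotated copies (assumed to be named ns.log.<n>.gz) are decompressed to stdout
    for f in "$dir"/ns.log.*.gz; do
        [ -f "$f" ] && gzip -dc "$f" | grep 'SSLVPN'
    done
    # The live log is searched last
    grep 'SSLVPN' "$dir/ns.log"
}
```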
Crashes and Hangs
Crash dump files are stored in the following locations:
- Citrix ADC (PPE) crash: /var/core
- BSD system crash: /var/crash
- Hang\race conditions:
  - Don’t force a reboot! You need a core analysis, so dump the core
  - For physical appliances, use the NMI button
  - For virtual appliances, see https://support.citrix.com/article/CTX207598 ; make sure to put pb_policy back after gathering a dump
Interface Troubleshooting
Use ‘show interface’ to determine what is happening on the network interface
- Look at InDisc() and OutDisc()
- Discards: appliance asked to handle more traffic than it is capable of
- Fctls: frame sent from switch saying there is too much traffic
- Stalls: packet is on the interface and could not get out for processing in a certain amount of time
- Hangs: BSD checking to see if the interface is responsive or not
- Muted: implies there is a loop; seeing the same packet on multiple interfaces
Load Balancing Basic Troubleshooting
- Does bypassing the LB vServer work?
- Is DNS name resolution working?
- Check the monitor state – is an appropriate monitor bound?
- Where is the request getting to? Does the backend server get the request? Does the network need MBF?
- Dumb down the vServer: if SSL, does HTTP work? If HTTP, does TCP work?
- Check persistence settings:
  - Are we using SourceIP behind a proxy\NAT? If so, use SRCIPSRCPORTHASH LB method instead
  - If SSL, use SSLSession
  - COOKIEINSERT does not work for all clients or applications
  - Try disabling persistence and use SRCIPSRCPORTHASH LB method; this may help with uneven LB
Local Syslog
Logs are stored in /var/log and named accordingly; logs are compressed and rotated as per the settings in /etc/newsyslog.conf
- Newsyslog process runs every hour via cron
- A log file must reach its configured size before it is rotated; rotated files are timestamped on the hour
- See: https://support.citrix.com/article/CTX121898 to modify schedule
- The rotation process can be debugged by running #newsyslog -v
- *When using the local syslog viewer, always filter by module*
NSCONMSG (all the things)
Much of the (very detailed) performance data and statistics for virtual servers is stored in the newnslog file in /var/nslog. Rotation of these files is controlled by nslog.sh and nsagg.conf. *Modifying these files is NOT recommended*: each appliance has unique optimization settings for these log files depending on appliance size, platform, etc. The nsconmsg command is run from the shell prompt.
*Read the help file!!* nsconmsg -help
*Read the CTX article!* https://support.citrix.com/article/CTX113341
Common nsconmsg arguments:
- -d <operation>
  - current (current performance data)
  - stats (current statistics counters)
  - memstats (current memory statistics)
- -K <file name> (performance information from this data file)
- -s <name=value> (debug parameters)
  - ConLB (load balancing performance data)
  - ConCSW (content switching performance data)
  - ConSSL (SSL performance data)
- -g <match string> (display only symbols that match this pattern)
Some nsconmsg examples (note that oldconmsg is itself a -d operation, not a file name; add -K <file name> to read from an archived newnslog instead of live data):
- nsconmsg -d current -g cpu_use
- nsconmsg -K newnslog -d event
- nsconmsg -d current -g ha_cur_master_state
- nsconmsg -s ConLB=2 -d oldconmsg
- nsconmsg -s ConCSW=2 -d oldconmsg
- nsconmsg -d current -g pol_hits
- nsconmsg -s ConSSL=2 -d oldconmsg
- nsconmsg -s ConCMP=2 -d oldconmsg
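When an issue spans several log files, the -K form can be looped. The sketch below runs the event display from the examples above against every file passed in. The function name nslog_events and the NSCONMSG override variable are hypothetical, and it assumes any archived newnslog files have already been decompressed.

```shell
# Run "nsconmsg -K <file> -d event" against each data file given as an argument.
# NSCONMSG can be set to override the binary (handy for a dry run with "echo").
nslog_events() {
    for f in "$@"; do
        printf '== %s ==\n' "$f"
        "${NSCONMSG:-nsconmsg}" -K "$f" -d event
    done
}

# Example: nslog_events /var/nslog/newnslog
```

Setting NSCONMSG=echo prints the arguments that would be passed without touching the appliance.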
Packet Captures
By default, the ADC uses the nstrace script and outputs to /var/nstrace in either NSCAP (.cap) or PCAP file format (use ‘-traceformat’ to specify from the CLI). It can be run from the GUI or the CLI.
- Use ‘-size 0’ to capture all packets (specify zero in the ‘Packet Size’ field in the GUI)
- Let the ADC decrypt all encrypted traffic in the trace with the ‘-mode sslplain’ argument
  - This is available in the GUI, but you must expand the More section
  - *BE AWARE* of what you are doing: you are saving unencrypted traffic!
  - This option eliminates the need to import private keys into Wireshark
  - Note: Wireshark cannot decrypt ECC ciphers with the private key alone!
- Start a trace (CLI): start nstrace -size 0 -mode sslplain
- Stop a trace (CLI): stop nstrace
- Show the status of the trace: show nstrace
- Capture filter for a specific vServer: -filter "vsvrname == <vserver_Name>"
- Capture filter for a destination IP: -filter "DESTIP == <ip.address.here>"
- Other filters:
  - SOURCEIP
  - DESTIP
  - DESTPORT
  - CONNECTION.INTF.EQ(0/1)*
  - CONNECTION.VLANID.EQ(3)*
  - *Interface\VLAN captures require the ‘-tcpdump ENABLED’ argument
- Cyclical traces can help troubleshoot intermittent issues by allowing you to define the length of time for each trace file and how many files to create before overwriting
  - Example: Start a new trace every 30 seconds and create no more than 50 files before starting to overwrite the files
  - > start nstrace -size 0 -mode sslplain -filter "CONNECTION.DSTIP.EQ(10.1.1.13) || CONNECTION.SRCIP.EQ(192.168.1.118)" -nf 50 -time 30
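Since pasted quotes tend to get mangled, one option is to generate the cyclical trace command with plain ASCII quotes and then type or paste the result into the CLI. This is a rough sketch that only uses the arguments shown above. The function name nstrace_cmd is made up, and the defaults of 50 files and 30 seconds simply mirror the example.

```shell
# Print a "start nstrace" command for a cyclical, decrypted trace.
# $1 = filter expression, $2 = number of files, $3 = seconds per file.
nstrace_cmd() {
    filter="$1"
    files="${2:-50}"
    seconds="${3:-30}"
    printf 'start nstrace -size 0 -mode sslplain -filter "%s" -nf %s -time %s\n' \
        "$filter" "$files" "$seconds"
}

# Example: nstrace_cmd 'CONNECTION.DSTIP.EQ(10.1.1.13)' 10 60
```

The function only prints text; nothing is started until the command is entered at the ADC prompt.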
Performance Issues
A few notes:
- The Packet Processing Engine (PPE) will always show at or near 100% utilization in #top from the BSD shell; this is expected
- Httpd is the web GUI process
- CPU reported by the hypervisor may show 100% because the PPE runs in polling mode; see https://support.citrix.com/article/CTX229555 for more details
- Use ‘> stat cpu’ to see actual CPU usage by PPE
- Gather current and\or previous newnslog files
- Citrix ADC uses nsprofmon for CPU profiling
  - Started at boot time, runs continuously
  - If any PPE CPU exceeds 90%, data will be captured to newproflog_cpu_<cpu_id>.out
  - Logs to /var/nsproflog
- NSPROFLOG data capture parameters can be modified
  - Before using, please read: https://support.citrix.com/article/CTX212480
  - nsproflog.sh cpuuse=700 start (will capture data when PPE CPU is over 70%)
  - nsproflog.sh lctidle=2000 start (will capture data when CPU time spent in idle functions exceeds 2ms)
  - nsproflog.sh stop (stops the profiler and generates a .tar.gz file with profiling data)
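The cpuuse value is easy to get wrong. Going by the example above (700 for 70%), it appears to be expressed in tenths of a percent; treat that as an assumption and check the CTX article before relying on it. A tiny helper, with the made-up name nsproflog_start_cmd, can do the conversion:

```shell
# Print an nsproflog.sh start command for a CPU threshold given in whole percent.
# Assumes cpuuse is expressed in tenths of a percent (so 70% becomes 700).
nsproflog_start_cmd() {
    percent="$1"
    printf 'nsproflog.sh cpuuse=%s start\n' "$(( percent * 10 ))"
}

# Example: nsproflog_start_cmd 70
```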
Policy Hits
This gets its own section because I use it ALL THE TIME. It will let you know which session policy or authentication policy is being hit by a gateway user (for example).
nsconmsg -d current -g pol_hits
nsconmsg -d current -g _hits
nsconmsg -s disptime=1 -d current -g pol_hits
Show Commands – Load Balancing
Useful commands
- > show lb vserver <vServer Name>
- > show cs vserver <vServer Name>
- > show service <service name>
- > show connectiontable | grep <IP address or port>
- > show connectiontable "ip == <ip address> && state == established && svctype == SSL && svctype != MONITOR"
- > show persistentSessions
- > show dns addrec -type proxy
Show Commands – Performance
Useful commands
- > show version
- > show node
- > show info
- > show license
- > show savedConfig
- > show run
- > show hardware
- > show interface -summary
- #sysctl -a netscaler | more
- #dmesg
- #cat /var/nslog/dmesg.boot
- #tail -f /var/log/ns.log
Stat Commands
- > stat ns
- > stat interface -summary
- > stat interface <interface name>
- > stat ssl
- > stat cpu
- > stat lb vServer <vServer name>
- > stat cs vServer <vServer name>
- > stat service <service name>
- > stat dns <records>
- > stat http